Champions: 1991, 1992, 1993, 1996, 1997, 1998
Conference Titles: 1991, 1992, 1993, 1996, 1997, 1998
Division Titles: 1975, 1991, 1992, 1993, 1996, 1997, 1998, 2011, 2012
This assessment task allows you to consolidate and apply the concepts and skills you’ve learnt throughout the semester. This assessment requires you to generate a reproducible data analysis project.
Your reproducible data analysis project will be hosted as a repository on GitHub and you are required to submit the URL to your GitHub repository.
You are a data analyst with the Chicago Bulls competing in the NBA (National Basketball Association). In the most recent NBA season (2018-19), your team placed 27th out of 30 (for win-loss record). Your team’s budget for player contracts next season is $118 million, ranked 26th out of 30 (for the purpose of this assignment, next season is 2019-20). For context, the team with the highest payroll budget is Portland with $148 million, while the best performing team was Milwaukee Bucks (who clinched the best league record in 2018-19) with $131 million.
You have been tasked by the general manager of Chicago Bulls to find the best five starting players (one from each position) your team can afford. (Make sure you don’t use up all of your money on just these five players: you still need to fill a full team roster, but are just focussed on finding five starting players here.) You can choose players that are already playing for Chicago Bulls, you just need to prove that they are worth it.
## ── Attaching packages ──────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ purrr 0.3.4
## ✓ tibble 3.0.1 ✓ dplyr 0.8.5
## ✓ tidyr 1.1.0 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## ── Conflicts ─────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(prettydoc)
library(broom)
library(dplyr)
library(ggplot2)
library(knitr)
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
##
## set_names
## The following object is masked from 'package:tidyr':
##
## extract
We have been provided the following data sets:
1. 2018-19_nba_player-statistics.csv : sourced from basketball-reference.com
2. 2018-19_nba_player-salaries.csv : sourced from hoopshype.com/salaries
3. 2019-20_nba_team-payroll.csv : sourced from hoopshype.com/salaries
4. 2018-19_nba_team-statistics_1.csv : sourced from basketball-reference.com
5. 2018-19_nba_team-statistics_2.csv : sourced from basketball-reference.com
Read in the various files using the read_csv() function from the readr package.
## Parsed with column specification:
## cols(
## .default = col_double(),
## player_name = col_character(),
## Pos = col_character(),
## Tm = col_character()
## )
## See spec(...) for full column specifications.
## Warning: Missing column names filled in: 'X4' [4], 'X5' [5], 'X6' [6], 'X7' [7]
## Parsed with column specification:
## cols(
## player_id = col_double(),
## player_name = col_character(),
## salary = col_double(),
## X4 = col_logical(),
## X5 = col_logical(),
## X6 = col_logical(),
## X7 = col_logical()
## )
## Warning: Missing column names filled in: 'X23' [23], 'X24' [24], 'X25' [25]
## Parsed with column specification:
## cols(
## .default = col_double(),
## Team = col_character(),
## X23 = col_logical(),
## X24 = col_logical(),
## X25 = col_logical()
## )
## See spec(...) for full column specifications.
## Parsed with column specification:
## cols(
## .default = col_double(),
## Team = col_character()
## )
## See spec(...) for full column specifications.
## Parsed with column specification:
## cols(
## team_id = col_double(),
## team = col_character(),
## salary = col_character()
## )
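The calls that produced the parsing messages above are not shown in this output. As a rough sketch only, the files could be read with a small wrapper around `read_csv()`. The helper name `read_nba`, the `data/raw` folder and the demo rows are my own assumptions for illustration, not taken from the project.

```r
library(readr)

# Hypothetical helper: read one of the provided files from a data folder.
# The default folder "data/raw" is an assumption, not the project's confirmed layout.
read_nba <- function(file, dir = "data/raw") {
  read_csv(file.path(dir, file))
}

# Self-contained demonstration on a throwaway file with made-up rows
demo_dir <- tempdir()
write_csv(
  data.frame(
    player_id = 1:2,
    player_name = c("A Player", "B Player"),
    salary = c(1000000, 2000000),
    stringsAsFactors = FALSE
  ),
  file.path(demo_dir, "2018-19_nba_player-salaries.csv")
)
p_sal_demo <- read_nba("2018-19_nba_player-salaries.csv", dir = demo_dir)
```

In the project itself the same helper would be called once per file, for example `p_sal <- read_nba("2018-19_nba_player-salaries.csv")`. The "Missing column names filled in" warnings above are what `read_csv()` reports when a header row contains empty column names, which is why the extra `X4` to `X7` and `X23` to `X25` columns appear.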
#Rename the variables to remove % and variables starting with numbers
p_stats <- rename(p_stats,
FGp = 'FG%', x3P = '3P', x3PA = '3PA', x3Pp = '3P%', x2P = '2P', x2PA = '2PA', x2Pp = '2P%', eFGp = 'eFG%', FTp = 'FT%')
team_stats <- rename(team_stats,
x3PAr = '3PAr', TSp = 'TS%', eFGp = 'eFG%', TOVp = 'TOV%', ORBp = 'ORB%', DRBp = 'DRB%')
team_stats_2 <- rename(team_stats_2,
FGp = 'FG%', x3P = '3P', x3PA = '3PA', x3Pp = '3P%', x2P = '2P', x2PA = '2PA', x2Pp = '2P%', FTp = 'FT%')
# Replace the NAs found in shooting percentage of players who didn't attempt a particular shot
p_stats <- p_stats %>%
mutate_if(is.numeric, funs(ifelse(is.na(.), 0, .)))
## Warning: funs() is soft deprecated as of dplyr 0.8.0
## Please use a list of either functions or lambdas:
##
## # Simple named list:
## list(mean = mean, median = median)
##
## # Auto named with `tibble::lst()`:
## tibble::lst(mean, median)
##
## # Using lambdas
## list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
## This warning is displayed once per session.
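The warning above suggests replacing `funs()` with a lambda. A minimal sketch of that change is shown here on a made-up data frame standing in for `p_stats` (the values are invented for illustration).

```r
library(dplyr)

# Toy data standing in for p_stats (made-up values)
toy <- data.frame(
  player_name = c("A", "B", "C"),
  FGp = c(0.45, NA, 0.51),
  x3Pp = c(NA, 0.33, NA),
  stringsAsFactors = FALSE
)

# Same NA-to-zero replacement, written with a formula lambda instead of funs()
toy_clean <- toy %>%
  mutate_if(is.numeric, ~ ifelse(is.na(.), 0, .))
```

With dplyr 1.0.0 or later, the same step can also be written as `mutate(across(where(is.numeric), ~ ifelse(is.na(.x), 0, .x)))`.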
p_sal <- subset(p_sal, select = c(2:3)) # First remove player id as not required data
p_stats <- left_join(x = p_stats, y = p_sal) # This will join the salary to the respective Player
## Joining, by = "player_name"
p_stats <- drop_na(p_stats) # We can see that some Player salaries are missing, so we must remove these from our dataset to ensure we are 100% confident that our picks will keep us under the salary cap.
which(is.na(p_stats), arr.ind = TRUE) # Quick test to identify if any NAs remain in the dataset
## row col
x <- stringi::stri_trans_general(p_stats$player_name, "Latin-ASCII") # Saves the player names in a new character vector with the accents removed.
x <- stringr::str_replace_all(x, pattern = "\\.", replacement = "") # Removes the periods in players' names.
x_new <- as.data.frame(x, row.names = NULL, optional = FALSE, stringsAsFactors = FALSE) # Converts the cleaned vector into a single column data frame
x_new <- rename(x_new, player_name = 'x') # Changes column name to merge data frames
p_stats <- bind_cols(x = x_new, y = p_stats) # Combines the no accent name variable to the main data frame.
p_stats <- subset(p_stats, select = -c(player_name1)) # Removes the player name variable that had the accents.
p_stats <- p_stats[, c(1,2,3,4,30,5:29)] # Move the salary variable into a more logical position in the table
p_stats <- rename(p_stats, Salary = 'salary') # Rename salary to Salary, to tidy up the appearance slightly.
comb_team <- merge(team_stats, team_stats_2, by.x = "Team", by.y = "Team") # Combine the two sheets, matching by the Team names as the order of the two sheets is different.
comb_team <- subset(comb_team, select = -c(2, 23)) # This will remove the 'Ranking' columns that appeared twice. They aren't necessary in this analysis, so have been removed.
comb_team <- comb_team[, c(1, 2, 22:23, 3:21, 24:44)] # The next three pieces of code, will reorganise the data into an order that is preferable for me.
comb_team <- comb_team[, c(1:15, 36:44, 16:35)]
comb_team <- comb_team[, c(1:24, 33:44, 25:32)]
p_stats <- p_stats %>%
group_by(player_name) %>%
arrange(player_name, desc(G)) %>%
distinct(player_name, .keep_all = TRUE) # Will remove any duplicate players based on the number of games played in that row. The row with the most games played is kept for each duplicated player. The team value 'TOT' is the combined total for a player who appeared for two or more teams, so that is the row we want to keep.
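To show why sorting by games played before calling `distinct()` keeps the combined row, here is a small sketch on made-up rows. The player and team names are placeholders, not real data.

```r
library(dplyr)

# Made-up rows: one traded player with a combined "TOT" row plus per-team rows
toy <- data.frame(
  player_name = c("Traded Player", "Traded Player", "Traded Player", "Other Player"),
  Tm = c("TOT", "AAA", "BBB", "CCC"),
  G = c(70, 40, 30, 82),
  stringsAsFactors = FALSE
)

deduped <- toy %>%
  arrange(player_name, desc(G)) %>%        # row with the most games comes first for each player
  distinct(player_name, .keep_all = TRUE)  # keeps only that first row per player
```

Because the combined row always has at least as many games as any single-team row, it is the one that survives. In this sketch `distinct()` works on `player_name` without a preceding `group_by()`.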
p_stats <- p_stats %>%
filter(G >= 20, MP >= 100) # Filter out and remove players that haven't played enough and whose data could influence decisions unnecessarily.
p_stats <- p_stats %>%
group_by(Pos) %>%
mutate(PTSpm = PTS / MP,
FTpm = FT / MP,
BLKpm = BLK / MP,
ASTpm = AST / MP,
STLpm = STL / MP,
TOVpm = TOV / MP,
x3Ppm = x3P / MP,
PPG = PTS / G,
APG = AST / G,
RPG = TRB / G) # Creates new variables at the end of our data frame
p_stats <- arrange(p_stats, Pos) # Arranges the data frame in order of Position
p_stats <- p_stats %>%
mutate_if(is.numeric, round, digits = 3)
## `mutate_if()` ignored the following grouping variables:
## Column `Pos`
# Formatting the table with a scroll box so that it doesn't take up a considerable amount of the page. I have used the datatable function to include a search bar which will be helpful.
write_csv(x = p_stats, path = "data/processed/p_stats.csv")
write_csv(x = comb_team, path = "data/processed/comb_team.csv") # Writing and saving the combined and new processed/tidy data. The team data gets its own file name so that it does not overwrite the player data.
Team Statistics
Pace v Wins
Pace_wins <- comb_team %>%
ggplot(aes(Pace, W)) +
geom_point(alpha = 0.5) +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Team Pace Ratings and Wins.
Pace_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.08892848
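The single value printed above is consistent with the Pearson correlation between Pace and Wins, although the call that produced it is not shown. For a model with one predictor, the squared correlation equals the Multiple R-squared in the model summary, and 0.08892848 squared is about 0.0079, which matches the 0.007908 reported for Pace v Wins. A minimal sketch of that relationship, using made-up numbers in place of `comb_team$Pace` and `comb_team$W`:

```r
# Made-up numbers standing in for comb_team$Pace and comb_team$W
pace <- c(98.2, 99.5, 100.1, 101.3, 102.0, 103.4)
wins <- c(41, 35, 48, 39, 52, 44)

r <- cor(pace, wins)                 # Pearson correlation (the default method)
fit <- lm(wins ~ pace)               # simple linear regression on the same data
r_squared <- summary(fit)$r.squared

# For a one-predictor model these agree up to floating-point error
same <- isTRUE(all.equal(r^2, r_squared))

# The figures reported in this analysis line up the same way
reported_r_squared <- 0.08892848^2
```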
pace_wins_lm <- lm(W ~ Pace, data = comb_team)
summary(pace_wins_lm) # Creates a linear regression model for Pace v Wins.
##
## Call:
## lm(formula = W ~ Pace, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.7808 -7.1784 0.9702 8.7057 17.3618
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -9.220 106.323 -0.087 0.932
## Pace 0.502 1.063 0.472 0.640
##
## Residual standard error: 12.19 on 28 degrees of freedom
## Multiple R-squared: 0.007908, Adjusted R-squared: -0.02752
## F-statistic: 0.2232 on 1 and 28 DF, p-value: 0.6403
## 1 2 3 4 5 6 7 8
## 42.93945 40.78079 41.38320 40.32897 40.47958 39.27475 40.47958 39.82696
## 9 10 11 12 13 14 15 16
## 39.67636 41.43341 39.92736 40.02777 41.83502 42.58804 39.27475 40.07797
## 17 18 19 20 21 22 23 24
## 42.63824 41.08200 42.63824 40.78079 42.38723 40.02777 41.78482 41.23260
## 25 26 27 28 29 30
## 40.52978 42.53784 40.12817 41.08200 41.13220 41.68441
pace_wins_stdres <- rstandard(pace_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
pace_wins_points <- 1:length(pace_wins_stdres) # Gives the length of the variable
pace_wins_labels <- if_else(abs(pace_wins_stdres) >= 1.5, paste(pace_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = pace_wins_points, y = pace_wins_stdres)) +
geom_point() +
geom_text(aes(label = pace_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
pace_wins_hats <- hatvalues(pace_wins_lm) # Measures the leverage of the points.
ggplot(data = NULL, aes(x = pace_wins_points, y = pace_wins_hats)) +
geom_point() # Shows the leverage on a graph
pace_wins_cook <- cooks.distance(pace_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = pace_wins_points, y = pace_wins_cook))+
geom_point() # shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 0.1024738 1.725845 0.46
## Alternative hypothesis: rho != 0
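The table above has the layout printed by `car::durbinWatsonTest()`, although the call itself is not shown, so that attribution is an assumption. The statistic can also be computed by hand from the residuals: it is the sum of squared differences between consecutive residuals divided by the sum of squared residuals, and values near 2 suggest little first-order autocorrelation. A base R sketch on made-up data (the helper name `dw_statistic` is my own):

```r
# Durbin-Watson statistic from a vector of residuals (base R only)
dw_statistic <- function(e) {
  sum(diff(e)^2) / sum(e^2)
}

# Made-up data for illustration
set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30)
fit <- lm(y ~ x)
dw <- dw_statistic(residuals(fit))

# Perfectly alternating residuals give a value well above 2
dw_alternating <- dw_statistic(c(1, -1, 1, -1))
```

The statistic depends on the order of the rows, so for team data it reflects whatever order the combined table happens to be in.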
pace_wins_res <- residuals(pace_wins_lm)
pace_wins_fitted <- predict(pace_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = pace_wins_fitted, y = pace_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = pace_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.
Offence rating v Wins
of_wins <- comb_team %>%
ggplot(aes(ORtg, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Team Offensive Ratings and Wins.
of_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.8411322
of_wins_lm <- lm(W ~ ORtg, data = comb_team)
summary(of_wins_lm) # Creates a linear regression model for Offensive Rating v Wins.
##
## Call:
## lm(formula = W ~ ORtg, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.9184 -4.5433 0.0544 5.7203 8.6818
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -330.3378 45.1380 -7.318 5.73e-08 ***
## ORtg 3.3636 0.4087 8.230 5.88e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.621 on 28 degrees of freedom
## Multiple R-squared: 0.7075, Adjusted R-squared: 0.6971
## F-statistic: 67.73 on 1 and 28 DF, p-value: 5.882e-09
## 1 2 3 4 5 6 7 8
## 33.26380 47.05442 38.30915 44.36357 22.16402 31.91837 37.63643 49.74527
## 9 10 11 12 13 14 15 16
## 36.29101 59.49962 58.15419 39.31822 47.72713 32.25473 26.53666 30.57294
## 17 18 19 20 21 22 23 24
## 52.43613 44.36357 44.36357 21.15495 40.66364 35.95465 48.39985 25.86395
## 25 26 27 28 29 30
## 55.46334 41.00000 49.40892 50.08163 42.68178 43.35450
of_wins_stdres <- rstandard(of_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
of_wins_points <- 1:length(of_wins_stdres) # Gives the length of the variable
of_wins_labels <- if_else(abs(of_wins_stdres) >= 1.5, paste(of_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = of_wins_points, y = of_wins_stdres)) +
geom_point() +
geom_text(aes(label = of_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
of_wins_hats <- hatvalues(of_wins_lm) # Measures the leverage of the points.
ggplot(data = NULL, aes(x = of_wins_points, y = of_wins_hats)) +
geom_point() # Shows the leverage on a graph
of_wins_cook <- cooks.distance(of_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = of_wins_points, y = of_wins_cook))+
geom_point() # shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 0.1891721 1.501797 0.17
## Alternative hypothesis: rho != 0
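The same three diagnostics (standardised residuals, leverage and Cook's distance) are repeated for every model in this report. As a sketch only, they could be collected in one helper that returns a data frame per model. The function name `lm_diagnostics` and its `label_cutoff` argument are my own, and the data below is made up.

```r
# Collect the per-observation diagnostics used throughout this report
lm_diagnostics <- function(model, label_cutoff = 1.5) {
  stdres <- unname(rstandard(model))  # residuals divided by their standard deviation
  points <- seq_along(stdres)         # index of each observation
  data.frame(
    point = points,
    stdres = stdres,
    leverage = unname(hatvalues(model)),    # leverage of each point
    cooks = unname(cooks.distance(model)),  # influence of each point on the coefficients
    label = ifelse(abs(stdres) >= label_cutoff, as.character(points), ""),
    stringsAsFactors = FALSE
  )
}

# Made-up data for illustration
set.seed(42)
toy <- data.frame(x = rnorm(30, mean = 110, sd = 3))
toy$y <- -330 + 3.4 * toy$x + rnorm(30, sd = 6)
diag_tbl <- lm_diagnostics(lm(y ~ x, data = toy))
```

With a helper like this, a call such as `lm_diagnostics(of_wins_lm)` would return one table that can be passed to `ggplot()` for each of the three diagnostic plots.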
of_wins_res <- residuals(of_wins_lm)
of_wins_fitted <- predict(of_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = of_wins_fitted, y = of_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = of_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals
Defence rating v Wins
def_wins <- comb_team %>%
ggplot(aes(DRtg, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Defensive Rating and Wins. Note that the relationship is negative because a higher defensive rating is a worse outcome for the team: the ideal combination is more wins with a lower defensive rating.
def_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] -0.7597828
def_wins_lm <- lm(W ~ DRtg, data = comb_team)
summary(def_wins_lm) # Creates a linear regression model for Defensive Rating v Wins
##
## Call:
## lm(formula = W ~ DRtg, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.7612 -6.0172 -0.6127 6.1833 13.1944
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 383.8923 55.4715 6.921 1.60e-07 ***
## DRtg -3.1058 0.5023 -6.184 1.12e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.959 on 28 degrees of freedom
## Multiple R-squared: 0.5773, Adjusted R-squared: 0.5622
## F-statistic: 38.24 on 1 and 28 DF, p-value: 1.118e-06
## 1 2 3 4 5 6 7 8
## 30.14000 49.08547 43.18442 34.48814 32.31407 18.64849 40.07861 45.66907
## 9 10 11 12 13 14 15 16
## 44.73733 43.80559 40.07861 53.12303 37.59396 43.80559 45.97966 49.70663
## 17 18 19 20 21 22 23 24
## 57.16059 33.24582 34.17756 30.76117 51.57012 48.15373 42.25268 26.41303
## 25 26 27 28 29 30
## 40.69977 37.59396 38.52570 51.25954 55.60768 30.14000
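To make the negative slope concrete, the fitted line from the Defensive Rating model can be evaluated by hand. The coefficients below are copied from the model summary for `W ~ DRtg`, while the rating of 110 is just an illustrative input I picked.

```r
# Coefficients copied from the summary of lm(W ~ DRtg) reported in this analysis
intercept <- 383.8923
slope <- -3.1058

# Expected wins for a given defensive rating under the fitted line
predict_wins <- function(drtg) {
  intercept + slope * drtg
}

wins_at_110 <- predict_wins(110)                      # expected wins at a defensive rating of 110
gain_per_point <- predict_wins(109) - predict_wins(110)  # change for a one-point lower rating
```

Under this model, lowering the defensive rating by one point is associated with roughly three additional expected wins.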
def_wins_stdres <- rstandard(def_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
def_wins_points <- 1:length(def_wins_stdres) # Gives the length of the variable
def_wins_labels <- if_else(abs(def_wins_stdres) >= 1.5, paste(def_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = def_wins_points, y = def_wins_stdres)) +
geom_point() +
geom_text(aes(label = def_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
def_wins_hats <- hatvalues(def_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = def_wins_points, y = def_wins_hats)) +
geom_point() # Shows the leverage on a graph
def_wins_cook <- cooks.distance(def_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = def_wins_points, y = def_wins_cook))+
geom_point() # Shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 -0.06302277 2.123362 0.738
## Alternative hypothesis: rho != 0
def_wins_res <- residuals(def_wins_lm)
def_wins_fitted <- predict(def_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = def_wins_fitted, y = def_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = def_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.
3-Point Attempt Rate v Wins
x3PAr_wins <- comb_team %>%
ggplot(aes(x3PAr, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between 3-Point Attempt Rate and Wins.
x3PAr_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.2843269
x3PAr_wins_lm <- lm(W ~ x3PAr, data = comb_team)
summary(x3PAr_wins_lm) # Creates a linear regression model.
##
## Call:
## lm(formula = W ~ x3PAr, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.2461 -6.4940 -0.6239 11.4337 15.5671
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.59 16.33 0.955 0.348
## x3PAr 70.82 45.13 1.569 0.128
##
## Residual standard error: 11.74 on 28 degrees of freedom
## Multiple R-squared: 0.08084, Adjusted R-squared: 0.04801
## F-statistic: 2.463 on 1 and 28 DF, p-value: 0.1278
## 1 2 3 4 5 6 7 8
## 44.13246 42.57449 44.13246 42.36204 36.48425 39.10447 45.47798 40.23754
## 9 10 11 12 13 14 15 16
## 43.49511 42.78694 52.34720 36.27180 36.48425 39.81264 39.81264 41.65387
## 17 18 19 20 21 22 23 24
## 45.26553 37.90059 38.53794 39.24611 40.16672 41.08734 39.81264 39.31692
## 25 26 27 28 29 30
## 39.60019 38.32549 35.84690 42.43286 43.49511 41.79551
x3PAr_wins_stdres <- rstandard(x3PAr_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x3PAr_wins_points <- 1:length(x3PAr_wins_stdres) # Gives the length of the variable
x3PAr_wins_labels <- if_else(abs(x3PAr_wins_stdres) >= 1.5, paste(x3PAr_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x3PAr_wins_points, y = x3PAr_wins_stdres)) +
geom_point() +
geom_text(aes(label = x3PAr_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
x3PAr_wins_hats <- hatvalues(x3PAr_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x3PAr_wins_points, y = x3PAr_wins_hats)) +
geom_point() # Shows the leverage
x3PAr_wins_cook <- cooks.distance(x3PAr_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x3PAr_wins_points, y = x3PAr_wins_cook))+
geom_point() # Shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 0.009106348 1.897535 0.772
## Alternative hypothesis: rho != 0
x3PAr_wins_res <- residuals(x3PAr_wins_lm)
x3PAr_wins_fitted <- predict(x3PAr_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x3PAr_wins_fitted, y = x3PAr_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = x3PAr_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.
True Shooting Percentage v Wins
TSP_wins <- comb_team %>%
ggplot(aes(TSp, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between True Shooting Percentage and Wins.
TSP_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.757103
TSP_wins_lm <- lm(W ~ TSp, data = comb_team)
summary(TSP_wins_lm) # Creates a linear regression model.
##
## Call:
## lm(formula = W ~ TSp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.4251 -5.4489 0.4241 5.1456 16.8073
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -297.32 55.19 -5.387 9.63e-06 ***
## TSp 604.62 98.60 6.132 1.28e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.997 on 28 degrees of freedom
## Multiple R-squared: 0.5732, Adjusted R-squared: 0.558
## F-statistic: 37.61 on 1 and 28 DF, p-value: 1.283e-06
## 1 2 3 4 5 6 7 8
## 38.23891 45.49432 38.84353 37.63430 29.77427 29.16965 38.23891 40.05277
## 9 10 11 12 13 14 15 16
## 31.58812 63.02823 53.95897 41.86662 50.33126 37.63430 34.00659 30.37889
## 17 18 19 20 21 22 23 24
## 55.16820 36.42506 43.07585 22.51886 32.19274 35.21583 49.72665 36.42506
## 25 26 27 28 29 30
## 46.09894 37.63430 48.51741 52.74973 48.51741 45.49432
TSP_wins_stdres <- rstandard(TSP_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
TSP_wins_points <- 1:length(TSP_wins_stdres) # Gives the length of the variable
TSP_wins_labels <- if_else(abs(TSP_wins_stdres) >= 1.5, paste(TSP_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = TSP_wins_points, y = TSP_wins_stdres)) +
geom_point() +
geom_text(aes(label = TSP_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
TSP_wins_hats <- hatvalues(TSP_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = TSP_wins_points, y = TSP_wins_hats)) +
geom_point() # Shows the leverage
TSP_wins_cook <- cooks.distance(TSP_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = TSP_wins_points, y = TSP_wins_cook))+
geom_point() # Shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 0.0368267 1.777 0.54
## Alternative hypothesis: rho != 0
TSP_wins_res <- residuals(TSP_wins_lm)
TSP_wins_fitted <- predict(TSP_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = TSP_wins_fitted, y = TSP_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = TSP_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.
Effective Field Goal Percentage
eFGp_wins <- comb_team %>%
ggplot(aes(eFGp, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Effective Field Goal Percentage and Wins.
eFGp_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.7818644
eFGp_wins_lm <- lm(W ~ eFGp, data = comb_team)
summary(eFGp_wins_lm) # Creates a linear regression model.
##
## Call:
## lm(formula = W ~ eFGp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.728 -5.514 1.954 4.208 14.272
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -282.36 48.75 -5.792 3.21e-06 ***
## eFGp 616.90 92.96 6.636 3.36e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.632 on 28 degrees of freedom
## Multiple R-squared: 0.6113, Adjusted R-squared: 0.5974
## F-statistic: 44.04 on 1 and 28 DF, p-value: 3.365e-07
## 1 2 3 4 5 6 7 8
## 39.66339 47.06617 38.42959 34.72820 29.17611 27.94231 37.81269 42.74788
## 9 10 11 12 13 14 15 16
## 31.64370 66.19003 52.00136 44.59858 43.98168 42.74788 31.02680 35.34510
## 17 18 19 20 21 22 23 24
## 56.93655 32.87750 43.98168 19.92263 34.72820 37.19579 45.83237 34.72820
## 25 26 27 28 29 30
## 43.36478 40.89718 47.06617 52.61826 49.53377 45.21547
eFGp_wins_stdres <- rstandard(eFGp_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
eFGp_wins_points <- 1:length(eFGp_wins_stdres) # Gives the length of the variable
eFGp_wins_labels <- if_else(abs(eFGp_wins_stdres) >= 1.5, paste(eFGp_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = eFGp_wins_points, y = eFGp_wins_stdres)) +
geom_point() +
geom_text(aes(label = eFGp_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
eFGp_wins_hats <- hatvalues(eFGp_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = eFGp_wins_points, y = eFGp_wins_hats)) +
geom_point() # Shows the leverage
eFGp_wins_cook <- cooks.distance(eFGp_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = eFGp_wins_points, y = eFGp_wins_cook))+
geom_point() # Shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 -0.09648682 2.016169 0.966
## Alternative hypothesis: rho != 0
eFGp_wins_res <- residuals(eFGp_wins_lm)
eFGp_wins_fitted <- predict(eFGp_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = eFGp_wins_fitted, y = eFGp_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = eFGp_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.
Defensive Rebound Percentage v Lower Defensive Rating
defrb_drtg <- comb_team %>%
ggplot(aes(DRBp, DRtg)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Defensive Rebound Percentage and the Defensive Rating.
defrb_drtg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] -0.5883396
defrb_drtg_lm <- lm(DRtg ~ DRBp, data = comb_team)
summary(defrb_drtg_lm) # Creates a linear regression model.
##
## Call:
## lm(formula = DRtg ~ DRBp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.7577 -1.5538 0.1272 1.0524 7.1279
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 186.0914 19.6640 9.464 3.2e-10 ***
## DRBp -0.9821 0.2551 -3.850 0.000627 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.422 on 28 degrees of freedom
## Multiple R-squared: 0.3461, Adjusted R-squared: 0.3228
## F-statistic: 14.82 on 1 and 28 DF, p-value: 0.0006273
## 1 2 3 4 5 6 7 8
## 111.0613 110.4721 111.0613 110.3739 110.1775 110.4721 109.9810 109.4900
## 9 10 11 12 13 14 15 16
## 108.8026 110.3739 113.0255 111.2577 111.4541 111.0613 109.8828 109.8828
## 17 18 19 20 21 22 23 24
## 107.2313 112.5344 110.6685 111.3559 109.2936 107.8205 108.9008 114.8914
## 25 26 27 28 29 30
## 109.5882 111.9452 108.1151 110.3739 107.2313 113.3201
defrb_drtg_stdres <- rstandard(defrb_drtg_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
defrb_drtg_points <- 1:length(defrb_drtg_stdres) # Gives the length of the variable
defrb_drtg_labels <- if_else(abs(defrb_drtg_stdres) >= 1.5, paste(defrb_drtg_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = defrb_drtg_points, y = defrb_drtg_stdres)) +
geom_point() +
geom_text(aes(label = defrb_drtg_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Create a graph with a straight line on the y-axis to show the limits where outliers are classified.
# Has an outlier at point number 6; need to explore this to see its impact on the data. It doesn't appear to have high leverage or high influence.
defrb_drtg_hats <- hatvalues(defrb_drtg_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = defrb_drtg_points, y = defrb_drtg_hats)) +
geom_point() # Shows the leverage
defrb_drtg_cook <- cooks.distance(defrb_drtg_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = defrb_drtg_points, y = defrb_drtg_cook))+
geom_point() # Shows the collective change through a scatterplot graph.
## lag Autocorrelation D-W Statistic p-value
## 1 0.233268 1.482338 0.16
## Alternative hypothesis: rho != 0
defrb_drtg_res <- residuals(defrb_drtg_lm)
defrb_drtg_fitted <- predict(defrb_drtg_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = defrb_drtg_fitted, y = defrb_drtg_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Graphs the results of homoscedasticity
ggplot(data = NULL, aes(sample = defrb_drtg_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.
Turnover Totals contributing to Losses
tov_loss <- comb_team %>%
ggplot(aes(TOV, L)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Turnover Totals and Losses
tov_loss # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.2471353
tov_loss_lm <- lm(L ~ TOV, data = comb_team)
summary(tov_loss_lm) # Creates a linear regression model.
##
## Call:
## lm(formula = L ~ TOV, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.3705 -9.4752 -0.2251 7.5361 24.1344
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.16303 30.33643 0.005 0.996
## TOV 0.03536 0.02620 1.350 0.188
##
## Residual standard error: 11.86 on 28 degrees of freedom
## Multiple R-squared: 0.06108, Adjusted R-squared: 0.02754
## F-statistic: 1.821 on 1 and 28 DF, p-value: 0.188
## 1 2 3 4 5 6 7 8
## 49.56487 37.36470 43.87146 35.56120 41.14852 39.27429 41.43143 39.13284
## 9 10 11 12 13 14 15 16
## 40.29982 41.50215 38.84994 39.84010 42.35086 45.56888 40.72417 42.88130
## 17 18 19 20 21 22 23 24
## 40.37054 38.14268 43.12884 40.86562 40.65344 38.42559 43.41174 45.39206
## 25 26 27 28 29 30
## 40.29982 38.88530 35.24293 40.83026 44.01291 40.97171
tov_loss_stdres <- rstandard(tov_loss_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
tov_loss_points <- 1:length(tov_loss_stdres) # Gives the length of the variable
tov_loss_labels <- if_else(abs(tov_loss_stdres) >= 1.5, paste(tov_loss_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = tov_loss_points, y = tov_loss_stdres)) +
geom_point() +
geom_text(aes(label = tov_loss_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
tov_loss_hats <- hatvalues(tov_loss_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = tov_loss_points, y = tov_loss_hats)) +
geom_point() # Plots the leverage of each point
tov_loss_cook <- cooks.distance(tov_loss_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = tov_loss_points, y = tov_loss_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 0.09567978 1.784956 0.502
## Alternative hypothesis: rho != 0
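The table above has the layout printed by `car::durbinWatsonTest()`, although the call itself is not shown, so that is an assumption. The statistic can also be computed by hand from the residuals, which makes clear why values near 2 indicate little lag-1 autocorrelation. A minimal sketch on hypothetical toy data (not `comb_team`):

```r
# Hypothetical toy data standing in for comb_team (which is not reproduced here).
set.seed(42)
toy <- data.frame(TOV = round(rnorm(30, mean = 1150, sd = 80)))
toy$L <- round(0.03 * toy$TOV + rnorm(30, mean = 5, sd = 10))
toy_lm <- lm(L ~ TOV, data = toy)

# Durbin-Watson statistic: squared successive differences of the residuals
# divided by the residual sum of squares.
res <- unname(residuals(toy_lm))
dw_stat <- sum(diff(res)^2) / sum(res^2)

# Lag-1 autocorrelation of the residuals; dw_stat is roughly 2 * (1 - lag1).
lag1 <- sum(res[-1] * res[-length(res)]) / sum(res^2)
print(c(dw = dw_stat, lag1 = lag1))
```

The reported figures follow the same pattern: a lag-1 autocorrelation of about 0.096 goes with a statistic of about 1.78, slightly below 2.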
tov_loss_res <- residuals(tov_loss_lm)
tov_loss_fitted <- predict(tov_loss_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = tov_loss_fitted, y = tov_loss_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = tov_loss_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Net Rating v Wins
net_win <- comb_team %>%
ggplot(aes(NRtg, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Team Net Rating and Wins
net_win # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.9800362
net_wins_lm <- lm(W ~ NRtg, data = comb_team) # Fits a linear regression model.
summary(net_wins_lm) # Prints the model summary.
##
## Call:
## lm(formula = W ~ NRtg, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.099 -1.440 0.325 1.697 4.810
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 41.00808 0.44436 92.28 <2e-16 ***
## NRtg 2.42414 0.09294 26.08 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.434 on 28 degrees of freedom
## Multiple R-squared: 0.9605, Adjusted R-squared: 0.9591
## F-statistic: 680.3 on 1 and 28 DF, p-value: < 2.2e-16
## 1 2 3 4 5 6 7 8
## 26.94806 51.67431 40.76567 38.34152 20.64529 17.00907 37.85670 50.94706
## 9 10 11 12 13 14 15 16
## 40.52325 56.52259 52.64396 49.25016 43.18981 36.88704 34.46290 40.28084
## 17 18 19 20 21 22 23 24
## 61.85570 37.37187 38.09911 18.70597 49.00775 42.94739 47.31085 18.70597
## 25 26 27 28 29 30
## 51.18948 38.34152 45.12912 55.55293 53.61362 34.22048
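The fitted line can be read directly from the coefficient table: predicted wins are the intercept plus the slope times net rating. The sketch below plugs in the reported estimates; `predict_wins` is a hypothetical helper introduced only for illustration and is not part of the original analysis.

```r
# Coefficients as reported by summary(net_wins_lm) above.
intercept <- 41.00808
slope     <- 2.42414

# Hypothetical helper: expected wins for a given team net rating under the
# fitted line W = intercept + slope * NRtg.
predict_wins <- function(nrtg) intercept + slope * nrtg

# A break-even team (NRtg = 0) is predicted to win about half of an
# 82-game season; each extra point of net rating is worth roughly 2.4 wins.
net_ratings <- c(-5, 0, 5)
predicted <- predict_wins(net_ratings)
print(data.frame(NRtg = net_ratings, predicted_W = predicted))
```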
net_wins_stdres <- rstandard(net_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
net_wins_points <- 1:length(net_wins_stdres) # Gives the length of the variable
net_wins_labels <- if_else(abs(net_wins_stdres) >= 1.5, paste(net_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = net_wins_points, y = net_wins_stdres)) +
geom_point() +
geom_text(aes(label = net_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
net_wins_hats <- hatvalues(net_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = net_wins_points, y = net_wins_hats)) +
geom_point() # Plots the leverage of each point
net_wins_cook <- cooks.distance(net_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = net_wins_points, y = net_wins_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 -0.0192226 1.983334 0.952
## Alternative hypothesis: rho != 0
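The leverage and Cook's distance plots above are read by eye. A numeric cutoff can make the same check repeatable. The sketch below uses two common rules of thumb (leverage above 2p/n, Cook's distance above 4/n); these thresholds are conventions rather than anything stated in this analysis, and the data are hypothetical.

```r
# Hypothetical toy data standing in for comb_team.
set.seed(7)
toy <- data.frame(NRtg = rnorm(30, sd = 4))
toy$W <- 41 + 2.4 * toy$NRtg + rnorm(30, sd = 2.5)
toy_lm <- lm(W ~ NRtg, data = toy)

n <- nrow(toy)
p <- length(coef(toy_lm))        # number of estimated coefficients (2 here)

hats <- hatvalues(toy_lm)
cook <- cooks.distance(toy_lm)

# Common rules of thumb (conventions, not hard rules):
# leverage above 2 * p / n, Cook's distance above 4 / n.
high_leverage <- which(hats > 2 * p / n)
influential   <- which(cook > 4 / n)

print(list(high_leverage = high_leverage, influential = influential))
```

Leverage values always sum to the number of coefficients, so with 30 teams and two coefficients the average leverage is 2/30.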
net_wins_res <- residuals(net_wins_lm)
net_wins_fitted <- predict(net_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = net_wins_fitted, y = net_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = net_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Total Points contributing to Offensive Rating
pts_off <- comb_team %>%
ggplot(aes(PTS, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Points and Wins
pts_off # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.6606001
pts_wins_lm <- lm(W ~ PTS, data = comb_team) # Fits a linear regression model.
summary(pts_wins_lm) # Prints the model summary.
##
## Call:
## lm(formula = W ~ PTS, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.215 -7.302 1.969 6.859 14.044
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.749e+02 4.641e+01 -3.770 0.000776 ***
## PTS 2.368e-02 5.086e-03 4.656 7.1e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.19 on 28 degrees of freedom
## Multiple R-squared: 0.4364, Adjusted R-squared: 0.4163
## F-statistic: 21.68 on 1 and 28 DF, p-value: 7.098e-05
## 1 2 3 4 5 6 7 8
## 45.14174 43.29465 43.01048 40.09777 28.82580 27.92594 36.45096 39.95569
## 9 10 11 12 13 14 15 16
## 32.92255 53.57203 46.25473 34.79331 48.64647 42.08694 26.10253 30.31768
## 17 18 19 20 21 22 23 24
## 54.42453 43.46041 49.21480 28.11538 47.34403 33.44352 48.71751 33.79873
## 25 26 27 28 29 30
## 47.69924 46.77570 41.87381 47.27299 41.99222 46.46785
pts_wins_stdres <- rstandard(pts_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
pts_wins_points <- 1:length(pts_wins_stdres) # Gives the length of the variable
pts_wins_labels <- if_else(abs(pts_wins_stdres) >= 1.5, paste(pts_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = pts_wins_points, y = pts_wins_stdres)) +
geom_point() +
geom_text(aes(label = pts_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
pts_wins_hats <- hatvalues(pts_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = pts_wins_points, y = pts_wins_hats)) +
geom_point() # Plots the leverage of each point
pts_wins_cook <- cooks.distance(pts_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = pts_wins_points, y = pts_wins_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 0.1627497 1.475814 0.15
## Alternative hypothesis: rho != 0
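The standardised residuals computed with `rstandard()` above are described as the residual divided by its standard deviation. More precisely, each residual is divided by the residual standard error scaled by the square root of one minus that point's leverage. A minimal sketch on hypothetical toy data, checking the manual calculation against `rstandard()`:

```r
# Hypothetical toy data standing in for comb_team.
set.seed(3)
toy <- data.frame(PTS = round(rnorm(30, mean = 9100, sd = 250)))
toy$W <- -175 + 0.0237 * toy$PTS + rnorm(30, sd = 9)
toy_lm <- lm(W ~ PTS, data = toy)

# Standardised residual = raw residual / (sigma_hat * sqrt(1 - leverage)).
sigma_hat <- summary(toy_lm)$sigma
manual_stdres <- residuals(toy_lm) / (sigma_hat * sqrt(1 - hatvalues(toy_lm)))

matches_rstandard <- isTRUE(all.equal(manual_stdres, rstandard(toy_lm)))
print(matches_rstandard)
```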
pts_wins_res <- residuals(pts_wins_lm)
pts_wins_fitted <- predict(pts_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = pts_wins_fitted, y = pts_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = pts_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Three Points v Wins
x3_wins <- comb_team %>%
ggplot(aes(x3P, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between successful 3 Point attempts and Wins
x3_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.4838773
x3_wins_lm <- lm(W ~ x3P, data = comb_team) # Fits a linear regression model.
summary(x3_wins_lm) # Prints the model summary.
##
## Call:
## lm(formula = W ~ x3P, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.8669 -5.9449 -0.9138 10.2949 14.3599
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -2.94961 15.14840 -0.195 0.84702
## x3P 0.04716 0.01612 2.926 0.00674 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.71 on 28 degrees of freedom
## Multiple R-squared: 0.2341, Adjusted R-squared: 0.2068
## F-statistic: 8.56 on 1 and 28 DF, p-value: 0.006744
## 1 2 3 4 5 6 7 8
## 47.37509 45.72433 46.43180 43.13027 32.18807 36.99886 45.25268 39.64009
## 9 10 11 12 13 14 15 16
## 43.88491 48.31838 59.44925 33.79167 35.77258 36.99886 35.30094 40.81920
## 17 18 19 20 21 22 23 24
## 49.16735 36.05557 36.76304 35.86691 41.00786 41.24368 38.97978 34.31048
## 25 26 27 28 29 30
## 39.68725 40.77204 35.34810 44.92253 43.88491 40.91353
x3_wins_stdres <- rstandard(x3_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x3_wins_points <- 1:length(x3_wins_stdres) # Gives the length of the variable
x3_wins_labels <- if_else(abs(x3_wins_stdres) >= 1.5, paste(x3_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x3_wins_points, y = x3_wins_stdres)) +
geom_point() +
geom_text(aes(label = x3_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
x3_wins_hats <- hatvalues(x3_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x3_wins_points, y = x3_wins_hats)) +
geom_point() # Plots the leverage of each point
x3_wins_cook <- cooks.distance(x3_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x3_wins_points, y = x3_wins_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 -0.04675096 1.96371 0.882
## Alternative hypothesis: rho != 0
x3_wins_res <- residuals(x3_wins_lm)
x3_wins_fitted <- predict(x3_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x3_wins_fitted, y = x3_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = x3_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Two Points v Wins
x2_wins <- comb_team %>%
ggplot(aes(x2P, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between successful Two Point Attempts and Wins
x2_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.06870566
x2_wins_lm <- lm(W ~ x2P, data = comb_team) # Fits a linear regression model.
summary(x2_wins_lm) # Prints the model summary.
##
## Call:
## lm(formula = W ~ x2P, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.416 -7.210 1.202 9.174 18.939
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 29.701338 31.084806 0.955 0.348
## x2P 0.004636 0.012723 0.364 0.718
##
## Residual standard error: 12.21 on 28 degrees of freedom
## Multiple R-squared: 0.00472, Adjusted R-squared: -0.03083
## F-statistic: 0.1328 on 1 and 28 DF, p-value: 0.7183
## 1 2 3 4 5 6 7 8
## 40.48118 40.91701 40.15199 40.45799 41.38993 40.56000 39.71616 41.45948
## 9 10 11 12 13 14 15 16
## 39.86452 41.40847 38.48749 41.80721 41.58466 41.96022 40.37454 40.47190
## 17 18 19 20 21 22 23 24
## 41.06074 41.69130 42.40068 40.41627 41.59393 40.73155 41.37602 41.28793
## 25 26 27 28 29 30
## 41.59857 41.82112 42.01585 41.03756 40.46263 41.41311
x2_wins_stdres <- rstandard(x2_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x2_wins_points <- 1:length(x2_wins_stdres) # Gives the length of the variable
x2_wins_labels <- if_else(abs(x2_wins_stdres) >= 1.5, paste(x2_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x2_wins_points, y = x2_wins_stdres)) +
geom_point() +
geom_text(aes(label = x2_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
x2_wins_hats <- hatvalues(x2_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x2_wins_points, y = x2_wins_hats)) +
geom_point() # Plots the leverage of each point
x2_wins_cook <- cooks.distance(x2_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x2_wins_points, y = x2_wins_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 0.112963 1.721293 0.42
## Alternative hypothesis: rho != 0
x2_wins_res <- residuals(x2_wins_lm)
x2_wins_fitted <- predict(x2_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x2_wins_fitted, y = x2_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = x2_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Free Throws v Wins
ft_wins <- comb_team %>%
ggplot(aes(FT, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Successful Free Throw Attempts and Wins
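ft_wins # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.1297686
ft_wins_lm <- lm(W ~ FT, data = comb_team) # Fits a linear regression model.
summary(ft_wins_lm) # Prints the model summary.
##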
## Call:
## lm(formula = W ~ FT, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.3841 -8.1118 0.3849 8.5255 18.7619
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24.71592 23.61820 1.046 0.304
## FT 0.01123 0.01622 0.693 0.494
##
## Residual standard error: 12.14 on 28 degrees of freedom
## Multiple R-squared: 0.01684, Adjusted R-squared: -0.01827
## F-statistic: 0.4796 on 1 and 28 DF, p-value: 0.4943
## 1 2 3 4 5 6 7 8
## 40.92362 39.11528 42.18160 41.67616 39.63195 39.78920 42.02435 39.25006
## 9 10 11 12 13 14 15 16
## 40.60913 39.75550 42.48486 39.29499 45.52872 39.72180 41.03594 38.62107
## 17 18 19 20 21 22 23 24
## 41.23812 42.35008 41.13703 41.38413 41.12580 38.54245 44.28197 40.96855
## 25 26 27 28 29 30
## 42.21530 39.92398 40.53050 40.99101 42.01312 41.65370
ft_wins_stdres <- rstandard(ft_wins_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
ft_wins_points <- 1:length(ft_wins_stdres) # Gives the length of the variable
ft_wins_labels <- if_else(abs(ft_wins_stdres) >= 1.5, paste(ft_wins_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = ft_wins_points, y = ft_wins_stdres)) +
geom_point() +
geom_text(aes(label = ft_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
ft_wins_hats <- hatvalues(ft_wins_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = ft_wins_points, y = ft_wins_hats)) +
geom_point() # Plots the leverage of each point
ft_wins_cook <- cooks.distance(ft_wins_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = ft_wins_points, y = ft_wins_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 0.1088329 1.72528 0.492
## Alternative hypothesis: rho != 0
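The same diagnostic quantities (fitted values, residuals, standardised residuals, leverage, Cook's distance) are recomputed for every model in this section. One way to cut the repetition is to gather them in a single data frame per model. The sketch below assumes a hypothetical helper named `lm_diagnostics`, which is not part of the original analysis, and runs it on toy data rather than `comb_team`.

```r
# Hypothetical helper (not in the original analysis): gathers the
# per-observation diagnostics computed above into one data frame.
lm_diagnostics <- function(model) {
  data.frame(
    point  = seq_along(residuals(model)),
    fitted = unname(fitted(model)),
    res    = unname(residuals(model)),
    stdres = unname(rstandard(model)),
    hat    = unname(hatvalues(model)),
    cook   = unname(cooks.distance(model))
  )
}

# Toy data standing in for comb_team.
set.seed(5)
toy <- data.frame(FT = round(rnorm(30, mean = 1450, sd = 140)))
toy$W <- 25 + 0.011 * toy$FT + rnorm(30, sd = 12)
diag_df <- lm_diagnostics(lm(W ~ FT, data = toy))
print(head(diag_df, 3))
```

Each plot in the section could then take its columns from `diag_df` instead of from separately named vectors.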
ft_wins_res <- residuals(ft_wins_lm)
ft_wins_fitted <- predict(ft_wins_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = ft_wins_fitted, y = ft_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = ft_wins_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Three Points v Offensive Rating
x3p_ortg <- comb_team %>%
ggplot(aes(x3P, ORtg)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between successful 3 Point Attempts and Offensive Rating
x3p_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.5303433
x3p_ortg_lm <- lm(ORtg ~ x3P, data = comb_team) # Fits a linear regression model.
summary(x3p_ortg_lm) # Prints the model summary.
##
## Call:
## lm(formula = ORtg ~ x3P, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.4931 -2.1844 0.0529 2.0272 4.6598
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 98.354052 3.669782 26.80 < 2e-16 ***
## x3P 0.012927 0.003905 3.31 0.00257 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.595 on 28 degrees of freedom
## Multiple R-squared: 0.2813, Adjusted R-squared: 0.2556
## F-statistic: 10.96 on 1 and 28 DF, p-value: 0.002573
## 1 2 3 4 5 6 7 8
## 112.1473 111.6949 111.8888 110.9839 107.9848 109.3033 111.5656 110.0273
## 9 10 11 12 13 14 15 16
## 111.1907 112.4059 115.4567 108.4243 108.9672 109.3033 108.8380 110.3504
## 17 18 19 20 21 22 23 24
## 112.6386 109.0448 109.2387 108.9931 110.4022 110.4668 109.8463 108.5665
## 25 26 27 28 29 30
## 110.0402 110.3375 108.8509 111.4751 111.1907 110.3763
x3p_ortg_stdres <- rstandard(x3p_ortg_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x3p_ortg_points <- 1:length(x3p_ortg_stdres) # Gives the length of the variable
x3p_ortg_labels <- if_else(abs(x3p_ortg_stdres) >= 1.5, paste(x3p_ortg_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x3p_ortg_points, y = x3p_ortg_stdres)) +
geom_point() +
geom_text(aes(label = x3p_ortg_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
x3p_ortg_hats <- hatvalues(x3p_ortg_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x3p_ortg_points, y = x3p_ortg_hats)) +
geom_point() # Plots the leverage of each point
x3p_ortg_cook <- cooks.distance(x3p_ortg_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x3p_ortg_points, y = x3p_ortg_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 -0.1451083 2.200583 0.612
## Alternative hypothesis: rho != 0
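Several models in this section share a response and differ only in the predictor (for example `ORtg` against `x3P`, `x2P` and `FT`). A loop over predictor names is one way to fit them in a single pass. The sketch below is an illustration on hypothetical toy data whose column names mirror those used above; it is not the approach taken in the original analysis.

```r
# Toy data standing in for comb_team (column names mirror those used above).
set.seed(9)
toy <- data.frame(x3P = rnorm(30, 930, 120),
                  x2P = rnorm(30, 2430, 170),
                  FT  = rnorm(30, 1450, 140))
toy$ORtg <- 98 + 0.013 * toy$x3P + rnorm(30, sd = 2.5)

predictors <- c("x3P", "x2P", "FT")

# Fit ORtg ~ predictor for each predictor and keep the models in a named list.
models <- lapply(predictors, function(v) {
  lm(reformulate(v, response = "ORtg"), data = toy)
})
names(models) <- predictors

# One R-squared per model.
r_squared <- sapply(models, function(m) summary(m)$r.squared)
print(round(r_squared, 3))
```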
x3p_ortg_res <- residuals(x3p_ortg_lm)
x3p_ortg_fitted <- predict(x3p_ortg_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x3p_ortg_fitted, y = x3p_ortg_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = x3p_ortg_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Two Points v Offensive Rating
x2p_ortg <- comb_team %>%
ggplot(aes(x2P, ORtg)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between successful 2 Point Attempts and Offensive Rating
x2p_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.09895804
x2p_ortg_lm <- lm(ORtg ~ x2P, data = comb_team) # Fits a linear regression model.
summary(x2p_ortg_lm) # Prints the model summary.
##
## Call:
## lm(formula = ORtg ~ x2P, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.7404 -1.9357 0.0999 2.0059 6.0050
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.063e+02 7.754e+00 13.714 6e-14 ***
## x2P 1.670e-03 3.174e-03 0.526 0.603
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.046 on 28 degrees of freedom
## Multiple R-squared: 0.009793, Adjusted R-squared: -0.02557
## F-statistic: 0.2769 on 1 and 28 DF, p-value: 0.6029
## 1 2 3 4 5 6 7 8
## 110.2131 110.3701 110.0946 110.2048 110.5404 110.2415 109.9376 110.5655
## 9 10 11 12 13 14 15 16
## 109.9910 110.5471 109.4950 110.6907 110.6106 110.7459 110.1747 110.2098
## 17 18 19 20 21 22 23 24
## 110.4219 110.6490 110.9045 110.1897 110.6139 110.3033 110.5354 110.5037
## 25 26 27 28 29 30
## 110.6156 110.6958 110.7659 110.4135 110.2064 110.5488
x2p_ortg_stdres <- rstandard(x2p_ortg_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x2p_ortg_points <- 1:length(x2p_ortg_stdres) # Gives the length of the variable
x2p_ortg_labels <- if_else(abs(x2p_ortg_stdres) >= 1.5, paste(x2p_ortg_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x2p_ortg_points, y = x2p_ortg_stdres)) +
geom_point() +
geom_text(aes(label = x2p_ortg_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
x2p_ortg_hats <- hatvalues(x2p_ortg_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x2p_ortg_points, y = x2p_ortg_hats)) +
geom_point() # Plots the leverage of each point
x2p_ortg_cook <- cooks.distance(x2p_ortg_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x2p_ortg_points, y = x2p_ortg_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 0.0257704 1.930104 0.876
## Alternative hypothesis: rho != 0
x2p_ortg_res <- residuals(x2p_ortg_lm)
x2p_ortg_fitted <- predict(x2p_ortg_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x2p_ortg_fitted, y = x2p_ortg_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = x2p_ortg_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Free Throws v Offensive Rating
ft_ortg <- comb_team %>%
ggplot(aes(FT, ORtg)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between successful Free Throws and Offensive Rating
ft_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.291237
ft_ortg_lm <- lm(ORtg ~ FT, data = comb_team) # Fits a linear regression model.
summary(ft_ortg_lm) # Prints the model summary.
##
## Call:
## lm(formula = ORtg ~ FT, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.1156 -1.7174 0.0868 2.2596 6.1985
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.013e+02 5.698e+00 17.770 <2e-16 ***
## FT 6.304e-03 3.913e-03 1.611 0.118
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.929 on 28 degrees of freedom
## Multiple R-squared: 0.08482, Adjusted R-squared: 0.05213
## F-statistic: 2.595 on 1 and 28 DF, p-value: 0.1184
## 1 2 3 4 5 6 7 8
## 110.3571 109.3422 111.0632 110.7795 109.6322 109.7205 110.9749 109.4179
## 9 10 11 12 13 14 15 16
## 110.1806 109.7015 111.2334 109.4431 112.9417 109.6826 110.4202 109.0649
## 17 18 19 20 21 22 23 24
## 110.5336 111.1577 110.4769 110.6156 110.4706 109.0207 112.2419 110.3823
## 25 26 27 28 29 30
## 111.0821 109.7961 110.1365 110.3950 110.9686 110.7669
ft_ortg_stdres <- rstandard(ft_ortg_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
ft_ortg_points <- 1:length(ft_ortg_stdres) # Gives the length of the variable
ft_ortg_labels <- if_else(abs(ft_ortg_stdres) >= 1.5, paste(ft_ortg_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = ft_ortg_points, y = ft_ortg_stdres)) +
geom_point() +
geom_text(aes(label = ft_ortg_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
ft_ortg_hats <- hatvalues(ft_ortg_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = ft_ortg_points, y = ft_ortg_hats)) +
geom_point() # Plots the leverage of each point
ft_ortg_cook <- cooks.distance(ft_ortg_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = ft_ortg_points, y = ft_ortg_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 0.04193079 1.894461 0.816
## Alternative hypothesis: rho != 0
ft_ortg_res <- residuals(ft_ortg_lm)
ft_ortg_fitted <- predict(ft_ortg_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = ft_ortg_fitted, y = ft_ortg_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = ft_ortg_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Three Point Percentage v Wins
x3pp_w <- comb_team %>%
ggplot(aes(x3Pp, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between 3 Point Shot Percentage and Wins
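x3pp_w # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.5418599
x3pp_w_lm <- lm(W ~ x3Pp, data = comb_team) # Fits a linear regression model.
summary(x3pp_w_lm) # Prints the model summary.
##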
## Call:
## lm(formula = W ~ x3Pp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.7873 -5.8964 0.3398 7.7762 20.0635
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -110.23 44.37 -2.484 0.01923 *
## x3Pp 425.41 124.70 3.411 0.00198 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.29 on 28 degrees of freedom
## Multiple R-squared: 0.2936, Adjusted R-squared: 0.2684
## F-statistic: 11.64 on 1 and 28 DF, p-value: 0.001983
## 1 2 3 4 5 6 7 8
## 39.51105 45.04142 39.93647 39.08564 39.08564 40.78729 34.40610 39.08564
## 9 10 11 12 13 14 15 16
## 37.80940 53.54969 41.21271 48.87014 54.82593 31.42820 35.25692 38.23481
## 17 18 19 20 21 22 23 24
## 39.93647 39.08564 36.10775 34.40610 37.80940 41.21271 42.48895 29.72655
## 25 26 27 28 29 30
## 42.48895 50.57180 56.52758 45.46684 41.21271 34.83151
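The slope of 425.41 looks large because `x3Pp` appears to be stored as a proportion rather than a percentage: the fitted values of roughly 30 to 57 wins imply predictor values around 0.33 to 0.39. That reading is an inference from the output, not something the source states. Under it, the effect of one percentage point is the slope times 0.01:

```r
# Slope reported by summary() for W ~ x3Pp above.
slope_per_unit <- 425.41          # wins per 1.0 change in x3Pp (0% to 100%)

# x3Pp appears to be stored as a proportion (assumption), so one percentage
# point is a change of 0.01.
wins_per_pct_point <- slope_per_unit * 0.01
print(wins_per_pct_point)
```

On this reading, a one-point improvement in team three-point percentage is associated with a little over four additional wins.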
x3pp_w_stdres <- rstandard(x3pp_w_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x3pp_w_points <- 1:length(x3pp_w_stdres) # Gives the length of the variable
x3pp_w_labels <- if_else(abs(x3pp_w_stdres) >= 1.5, paste(x3pp_w_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x3pp_w_points, y = x3pp_w_stdres)) +
geom_point() +
geom_text(aes(label = x3pp_w_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
x3pp_w_hats <- hatvalues(x3pp_w_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x3pp_w_points, y = x3pp_w_hats)) +
geom_point() # Plots the leverage of each point
x3pp_w_cook <- cooks.distance(x3pp_w_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x3pp_w_points, y = x3pp_w_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 -0.006206046 1.972432 0.942
## Alternative hypothesis: rho != 0
x3pp_w_res <- residuals(x3pp_w_lm)
x3pp_w_fitted <- predict(x3pp_w_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x3pp_w_fitted, y = x3pp_w_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = x3pp_w_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Three Point Percentage v Offensive Rating
x3pp_ortg <- comb_team %>%
ggplot(aes(x3Pp, ORtg)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between 3 Point Shot Percentage and Offensive Rating
x3pp_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.5532274
x3pp_ortg_lm <- lm(ORtg ~ x3Pp, data = comb_team) # Fits a linear regression model.
summary(x3pp_ortg_lm) # Prints the model summary.
##
## Call:
## lm(formula = ORtg ~ x3Pp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.1112 -1.8453 0.1448 1.7548 5.0457
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 71.79 11.00 6.528 4.48e-07 ***
## x3Pp 108.62 30.91 3.514 0.00152 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.55 on 28 degrees of freedom
## Multiple R-squared: 0.3061, Adjusted R-squared: 0.2813
## F-statistic: 12.35 on 1 and 28 DF, p-value: 0.001519
## 1 2 3 4 5 6 7 8
## 110.0198 111.4318 110.1285 109.9112 109.9112 110.3457 108.7165 109.9112
## 9 10 11 12 13 14 15 16
## 109.5854 113.6042 110.4543 112.4094 113.9300 107.9562 108.9337 109.6940
## 17 18 19 20 21 22 23 24
## 110.1285 109.9112 109.1509 108.7165 109.5854 110.4543 110.7802 107.5217
## 25 26 27 28 29 30
## 110.7802 112.8438 114.3645 111.5405 110.4543 108.8251
x3pp_ortg_stdres <- rstandard(x3pp_ortg_lm) # Calculating the standardised residuals, which is the residual divided by their standard deviation
x3pp_ortg_points <- 1:length(x3pp_ortg_stdres) # Gives the length of the variable
x3pp_ortg_labels <- if_else(abs(x3pp_ortg_stdres) >= 1.5, paste(x3pp_ortg_points), "") # Will label the point on the graph if the residual is greater than 1.5 standard deviations.
ggplot(data = NULL, aes(x = x3pp_ortg_points, y = x3pp_ortg_stdres)) +
geom_point() +
geom_text(aes(label = x3pp_ortg_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classed as outliers.
x3pp_ortg_hats <- hatvalues(x3pp_ortg_lm) # Measures the leverage of the points
ggplot(data = NULL, aes(x = x3pp_ortg_points, y = x3pp_ortg_hats)) +
geom_point() # Plots the leverage of each point
x3pp_ortg_cook <- cooks.distance(x3pp_ortg_lm) # Collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x3pp_ortg_points, y = x3pp_ortg_cook)) +
geom_point() # Plots Cook's distance for each point
## lag Autocorrelation D-W Statistic p-value
## 1 -0.1190102 2.189357 0.662
## Alternative hypothesis: rho != 0
x3pp_ortg_res <- residuals(x3pp_ortg_lm)
x3pp_ortg_fitted <- predict(x3pp_ortg_lm) # Testing for homoscedasticity, and whether they have a constant variance across all x values.
ggplot(data = NULL, aes(x = x3pp_ortg_fitted, y = x3pp_ortg_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Plots residuals against fitted values to check for homoscedasticity.
ggplot(data = NULL, aes(sample = x3pp_ortg_res)) +
stat_qq() + stat_qq_line() # Normality of the residuals.

Two Point Percentage v Wins
x2pp_w <- comb_team %>%
ggplot(aes(x2Pp, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between 2 Point Shot Percentage and Wins
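x2pp_w # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.6009396
x2pp_w_lm <- lm(W ~ x2Pp, data = comb_team) # Fits a linear regression model.
summary(x2pp_w_lm) # Prints the model summary.
##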
## Call:
## lm(formula = W ~ x2Pp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.707 -9.320 2.787 7.149 11.738
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -145.27 46.86 -3.100 0.004375 **
## x2Pp 358.06 90.00 3.978 0.000445 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.785 on 28 degrees of freedom
## Multiple R-squared: 0.3611, Adjusted R-squared: 0.3383
## F-statistic: 15.83 on 1 and 28 DF, p-value: 0.0004453
## 1 2 3 4 5 6 7 8
## 40.20034 43.42285 38.41006 36.26172 32.32310 29.45865 42.34868 43.42285
## 9 10 11 12 13 14 15 16
## 33.75533 54.16454 52.01620 39.84228 36.26172 48.43564 35.54561 36.97783
## 17 18 19 20 21 22 23 24
## 57.02899 35.18755 46.64536 26.23614 37.33589 36.97783 44.13896 42.70674
## 25 26 27 28 29 30
## 41.99062 35.18755 38.41006 47.71953 48.43564 49.15175
x2pp_w_stdres <- rstandard(x2pp_w_lm) # Standardised residuals: each residual divided by its estimated standard deviation
x2pp_w_points <- seq_along(x2pp_w_stdres) # Index of each observation, from 1 to n
x2pp_w_labels <- if_else(abs(x2pp_w_stdres) >= 1.5, paste(x2pp_w_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = x2pp_w_points, y = x2pp_w_stdres)) +
  geom_point() +
  geom_text(aes(label = x2pp_w_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
x2pp_w_hats <- hatvalues(x2pp_w_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = x2pp_w_points, y = x2pp_w_hats)) +
  geom_point() # Shows the leverage of each point
x2pp_w_cook <- cooks.distance(x2pp_w_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x2pp_w_points, y = x2pp_w_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 -0.006206046 1.972432 0.904
## Alternative hypothesis: rho != 0
x2pp_w_res <- residuals(x2pp_w_lm) # Residuals of the fitted model
x2pp_w_fitted <- predict(x2pp_w_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = x2pp_w_fitted, y = x2pp_w_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = x2pp_w_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Two Point Percentage v Offensive Rating
x2pp_ortg <- comb_team %>%
  ggplot(aes(x2Pp, ORtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between 2 Point Shot Percentage and Offensive Rating
x2pp_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.5532274
x2pp_ortg_lm <- lm(ORtg ~ x2Pp, data = comb_team) # Creates a linear regression model
summary(x2pp_ortg_lm) # Prints the model summary
##
## Call:
## lm(formula = ORtg ~ x2Pp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.9585 -1.4956 0.1481 1.8522 4.0339
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 60.36 11.20 5.391 9.52e-06 ***
## x2Pp 96.19 21.50 4.473 0.000117 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.338 on 28 degrees of freedom
## Multiple R-squared: 0.4168, Adjusted R-squared: 0.3959
## F-statistic: 20.01 on 1 and 28 DF, p-value: 0.0001169
## 1 2 3 4 5 6 7 8
## 110.1852 111.0509 109.7042 109.1271 108.0690 107.2995 110.7623 111.0509
## 9 10 11 12 13 14 15 16
## 108.4538 113.9366 113.3594 110.0890 109.1271 112.3975 108.9347 109.3195
## 17 18 19 20 21 22 23 24
## 114.7061 108.8385 111.9166 106.4338 109.4157 109.3195 111.2433 110.8585
## 25 26 27 28 29 30
## 110.6661 108.8385 109.7042 112.2052 112.3975 112.5899
x2pp_ortg_stdres <- rstandard(x2pp_ortg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
x2pp_ortg_points <- seq_along(x2pp_ortg_stdres) # Index of each observation, from 1 to n
x2pp_ortg_labels <- if_else(abs(x2pp_ortg_stdres) >= 1.5, paste(x2pp_ortg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = x2pp_ortg_points, y = x2pp_ortg_stdres)) +
  geom_point() +
  geom_text(aes(label = x2pp_ortg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
x2pp_ortg_hats <- hatvalues(x2pp_ortg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = x2pp_ortg_points, y = x2pp_ortg_hats)) +
  geom_point() # Shows the leverage of each point
x2pp_ortg_cook <- cooks.distance(x2pp_ortg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = x2pp_ortg_points, y = x2pp_ortg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 -0.1360498 2.229185 0.556
## Alternative hypothesis: rho != 0
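The leverage and Cook's distance plots above are easier to read against a reference value. The sketch below uses two commonly quoted rules of thumb, leverage above 2p/n and Cook's distance above 4/n (with p estimated coefficients and n observations). These cut-offs are conventions assumed here rather than anything stated in this analysis, and the `mtcars` model and `demo_` names are stand-ins.

```r
# Illustrative stand-in model: mtcars replaces comb_team here (assumption).
demo_lm <- lm(mpg ~ wt, data = mtcars)
n <- nobs(demo_lm)
p <- length(coef(demo_lm)) # number of estimated coefficients (intercept + slope)

demo_hats <- hatvalues(demo_lm)
demo_cook <- cooks.distance(demo_lm)

# Rule-of-thumb cut-offs (assumed conventions, not hard limits).
lev_cutoff <- 2 * p / n
cook_cutoff <- 4 / n

# Observations that exceed each cut-off.
high_leverage <- names(demo_hats)[demo_hats > lev_cutoff]
high_influence <- names(demo_cook)[demo_cook > cook_cutoff]
print(high_leverage)
print(high_influence)
```

For the 30-team models in this analysis (n = 30, p = 2), both cut-offs would work out to 4/30, or about 0.13.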
x2pp_ortg_res <- residuals(x2pp_ortg_lm) # Residuals of the fitted model
x2pp_ortg_fitted <- predict(x2pp_ortg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = x2pp_ortg_fitted, y = x2pp_ortg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = x2pp_ortg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Free Throw Percentage v Wins
ftp_w <- comb_team %>%
  ggplot(aes(FTp, W)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Free Throw Percentage and Wins
ftp_w # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.1611608
ftp_w_lm <- lm(W ~ FTp, data = comb_team) # Creates a linear regression model
summary(ftp_w_lm) # Prints the model summary
##
## Call:
## lm(formula = W ~ FTp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -23.520 -7.355 1.762 9.610 18.637
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.18 53.49 -0.097 0.924
## FTp 60.21 69.68 0.864 0.395
##
## Residual standard error: 12.08 on 28 degrees of freedom
## Multiple R-squared: 0.02597, Adjusted R-squared: -0.008814
## F-statistic: 0.7466 on 1 and 28 DF, p-value: 0.3949
## 1 2 3 4 5 6 7 8
## 40.09884 43.10939 39.67737 42.80834 41.96538 42.50728 39.49673 40.27948
## 9 10 11 12 13 14 15 16
## 39.79779 43.04918 42.44707 40.09884 42.50728 36.90766 41.30306 36.66682
## 17 18 19 20 21 22 23 24
## 41.36327 42.20623 40.64074 40.52032 37.75061 41.90517 41.24285 41.72454
## 25 26 27 28 29 30
## 43.83192 38.53336 44.13298 43.22981 39.13547 41.06222
ftp_w_stdres <- rstandard(ftp_w_lm) # Standardised residuals: each residual divided by its estimated standard deviation
ftp_w_points <- seq_along(ftp_w_stdres) # Index of each observation, from 1 to n
ftp_w_labels <- if_else(abs(ftp_w_stdres) >= 1.5, paste(ftp_w_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = ftp_w_points, y = ftp_w_stdres)) +
  geom_point() +
  geom_text(aes(label = ftp_w_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
ftp_w_hats <- hatvalues(ftp_w_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = ftp_w_points, y = ftp_w_hats)) +
  geom_point() # Shows the leverage of each point
ftp_w_cook <- cooks.distance(ftp_w_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = ftp_w_points, y = ftp_w_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 0.1036315 1.742503 0.448
## Alternative hypothesis: rho != 0
ftp_w_res <- residuals(ftp_w_lm) # Residuals of the fitted model
ftp_w_fitted <- predict(ftp_w_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = ftp_w_fitted, y = ftp_w_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = ftp_w_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Free Throw Percentage v Offensive Rating
ft_ortg <- comb_team %>%
  ggplot(aes(FTp, ORtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Free Throw Percentage and Offensive Rating
ft_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.4243103
ft_ortg_lm <- lm(ORtg ~ FTp, data = comb_team) # Creates a linear regression model
summary(ft_ortg_lm) # Prints the model summary
##
## Call:
## lm(formula = ORtg ~ FTp, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.2356 -0.5181 0.3085 1.7017 4.1508
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 80.00 12.27 6.518 4.59e-07 ***
## FTp 39.64 15.99 2.480 0.0194 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.772 on 28 degrees of freedom
## Multiple R-squared: 0.18, Adjusted R-squared: 0.1508
## F-statistic: 6.148 on 1 and 28 DF, p-value: 0.01944
## 1 2 3 4 5 6 7 8
## 109.8067 111.7888 109.5292 111.5906 111.0356 111.3924 109.4103 109.9256
## 9 10 11 12 13 14 15 16
## 109.6085 111.7492 111.3527 109.8067 111.3924 107.7056 110.5995 107.5470
## 17 18 19 20 21 22 23 24
## 110.6392 111.1942 110.1635 110.0842 108.2606 110.9960 110.5599 110.8770
## 25 26 27 28 29 30
## 112.2645 108.7760 112.4627 111.8681 109.1724 110.4410
ft_ortg_stdres <- rstandard(ft_ortg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
ft_ortg_points <- seq_along(ft_ortg_stdres) # Index of each observation, from 1 to n
ft_ortg_labels <- if_else(abs(ft_ortg_stdres) >= 1.5, paste(ft_ortg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = ft_ortg_points, y = ft_ortg_stdres)) +
  geom_point() +
  geom_text(aes(label = ft_ortg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
ft_ortg_hats <- hatvalues(ft_ortg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = ft_ortg_points, y = ft_ortg_hats)) +
  geom_point() # Shows the leverage of each point
ft_ortg_cook <- cooks.distance(ft_ortg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = ft_ortg_points, y = ft_ortg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 -0.01352301 2.01149 0.94
## Alternative hypothesis: rho != 0
ft_ortg_res <- residuals(ft_ortg_lm) # Residuals of the fitted model
ft_ortg_fitted <- predict(ft_ortg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = ft_ortg_fitted, y = ft_ortg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = ft_ortg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Offensive Rebounds v Offensive Rating
orb_ortg <- comb_team %>%
  ggplot(aes(ORB, ORtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Offensive Rebounds and Offensive Rating
orb_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.117081
orb_ortg_lm <- lm(ORtg ~ ORB, data = comb_team) # Creates a linear regression model
summary(orb_ortg_lm) # Prints the model summary
##
## Call:
## lm(formula = ORtg ~ ORB, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.9373 -2.3514 0.2509 2.0480 5.7247
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.067e+02 5.964e+00 17.891 <2e-16 ***
## ORB 4.366e-03 6.998e-03 0.624 0.538
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.04 on 28 degrees of freedom
## Multiple R-squared: 0.01371, Adjusted R-squared: -0.02152
## F-statistic: 0.3892 on 1 and 28 DF, p-value: 0.5378
## 1 2 3 4 5 6 7 8
## 110.8651 110.2059 110.6250 110.2495 109.8304 110.5333 110.3281 110.9393
## 9 10 11 12 13 14 15 16
## 110.7822 110.1753 110.3456 110.0225 110.1709 110.3412 109.8522 110.7167
## 17 18 19 20 21 22 23 24
## 110.0225 110.7254 110.6643 110.4373 111.1969 110.2845 110.5901 109.9614
## 25 26 27 28 29 30
## 110.9175 110.6512 110.0007 110.1273 110.2757 110.1622
orb_ortg_stdres <- rstandard(orb_ortg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
orb_ortg_points <- seq_along(orb_ortg_stdres) # Index of each observation, from 1 to n
orb_ortg_labels <- if_else(abs(orb_ortg_stdres) >= 1.5, paste(orb_ortg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = orb_ortg_points, y = orb_ortg_stdres)) +
  geom_point() +
  geom_text(aes(label = orb_ortg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
orb_ortg_hats <- hatvalues(orb_ortg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = orb_ortg_points, y = orb_ortg_hats)) +
  geom_point() # Shows the leverage of each point
orb_ortg_cook <- cooks.distance(orb_ortg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = orb_ortg_points, y = orb_ortg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 0.02811017 1.910839 0.794
## Alternative hypothesis: rho != 0
orb_ortg_res <- residuals(orb_ortg_lm) # Residuals of the fitted model
orb_ortg_fitted <- predict(orb_ortg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = orb_ortg_fitted, y = orb_ortg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = orb_ortg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Defensive Rebounds v Defensive Rating
drb_drtg <- comb_team %>%
  ggplot(aes(DRB, DRtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Defensive Rebounds and Defensive Rating
drb_drtg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] -0.5825676
drb_drtg_lm <- lm(DRtg ~ DRB, data = comb_team) # Creates a linear regression model
summary(drb_drtg_lm) # Prints the model summary
##
## Call:
## lm(formula = DRtg ~ DRB, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.9317 -2.4715 0.6419 1.4557 4.4717
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 143.355591 8.699642 16.478 6.08e-16 ***
## DRB -0.011542 0.003043 -3.793 0.00073 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.434 on 28 degrees of freedom
## Multiple R-squared: 0.3394, Adjusted R-squared: 0.3158
## F-statistic: 14.38 on 1 and 28 DF, p-value: 0.0007305
## 1 2 3 4 5 6 7 8
## 110.7507 110.4737 109.6658 111.2932 111.0508 113.1283 110.0698 110.6699
## 9 10 11 12 13 14 15 16
## 111.5933 108.8464 113.1975 111.4317 109.4696 108.9041 112.1588 110.1275
## 17 18 19 20 21 22 23 24
## 105.0838 111.6163 109.0888 110.9123 109.7582 109.8620 108.4424 113.7746
## 25 26 27 28 29 30
## 109.1003 110.8200 109.7697 109.6543 108.9503 112.4358
drb_drtg_stdres <- rstandard(drb_drtg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
drb_drtg_points <- seq_along(drb_drtg_stdres) # Index of each observation, from 1 to n
drb_drtg_labels <- if_else(abs(drb_drtg_stdres) >= 1.5, paste(drb_drtg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = drb_drtg_points, y = drb_drtg_stdres)) +
  geom_point() +
  geom_text(aes(label = drb_drtg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
drb_drtg_hats <- hatvalues(drb_drtg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = drb_drtg_points, y = drb_drtg_hats)) +
  geom_point() # Shows the leverage of each point
drb_drtg_cook <- cooks.distance(drb_drtg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = drb_drtg_points, y = drb_drtg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 0.1855995 1.55609 0.242
## Alternative hypothesis: rho != 0
drb_drtg_res <- residuals(drb_drtg_lm) # Residuals of the fitted model
drb_drtg_fitted <- predict(drb_drtg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = drb_drtg_fitted, y = drb_drtg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = drb_drtg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Steals v Defensive Rating
stl_drtg <- comb_team %>%
  ggplot(aes(STL, DRtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Steals and Defensive Rating
stl_drtg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] -0.2123928
stl_drtg_lm <- lm(DRtg ~ STL, data = comb_team) # Creates a linear regression model
summary(stl_drtg_lm) # Prints the model summary
##
## Call:
## lm(formula = DRtg ~ STL, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.3024 -1.8397 -0.6048 1.9845 6.3658
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 116.058784 4.946220 23.46 <2e-16 ***
## STL -0.009035 0.007855 -1.15 0.26
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.926 on 28 degrees of freedom
## Multiple R-squared: 0.04511, Adjusted R-squared: 0.01101
## F-statistic: 1.323 on 1 and 28 DF, p-value: 0.2598
## 1 2 3 4 5 6 7 8
## 109.9603 109.6803 111.1891 110.7192 110.6108 111.2342 111.2433 110.3308
## 9 10 11 12 13 14 15 16
## 110.9180 110.4121 109.7345 109.6170 110.9903 110.4753 109.8790 110.3940
## 17 18 19 20 21 22 23 24
## 110.5024 109.8881 110.5476 111.0264 109.1382 111.1529 110.5837 109.4182
## 25 26 27 28 29 30
## 111.1258 109.9242 111.5324 109.9152 110.0687 109.8881
stl_drtg_stdres <- rstandard(stl_drtg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
stl_drtg_points <- seq_along(stl_drtg_stdres) # Index of each observation, from 1 to n
stl_drtg_labels <- if_else(abs(stl_drtg_stdres) >= 1.5, paste(stl_drtg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = stl_drtg_points, y = stl_drtg_stdres)) +
  geom_point() +
  geom_text(aes(label = stl_drtg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
stl_drtg_hats <- hatvalues(stl_drtg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = stl_drtg_points, y = stl_drtg_hats)) +
  geom_point() # Shows the leverage of each point
stl_drtg_cook <- cooks.distance(stl_drtg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = stl_drtg_points, y = stl_drtg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 0.05713453 1.753875 0.522
## Alternative hypothesis: rho != 0
stl_drtg_res <- residuals(stl_drtg_lm) # Residuals of the fitted model
stl_drtg_fitted <- predict(stl_drtg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = stl_drtg_fitted, y = stl_drtg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = stl_drtg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Steals v Offensive Rating
stl_ortg <- comb_team %>%
  ggplot(aes(STL, ORtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Steals and Offensive Rating
stl_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.02283556
stl_ortg_lm <- lm(ORtg ~ STL, data = comb_team) # Creates a linear regression model
summary(stl_ortg_lm) # Prints the model summary
##
## Call:
## lm(formula = ORtg ~ STL, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.8315 -2.1159 0.2053 2.1810 5.5010
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.098e+02 5.173e+00 21.222 <2e-16 ***
## STL 9.930e-04 8.216e-03 0.121 0.905
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.06 on 28 degrees of freedom
## Multiple R-squared: 0.0005215, Adjusted R-squared: -0.03517
## F-statistic: 0.01461 on 1 and 28 DF, p-value: 0.9047
## 1 2 3 4 5 6 7 8
## 110.4487 110.4795 110.3136 110.3653 110.3772 110.3087 110.3077 110.4080
## 9 10 11 12 13 14 15 16
## 110.3434 110.3990 110.4735 110.4864 110.3355 110.3921 110.4576 110.4010
## 17 18 19 20 21 22 23 24
## 110.3891 110.4566 110.3841 110.3315 110.5390 110.3176 110.3802 110.5083
## 25 26 27 28 29 30
## 110.3206 110.4527 110.2759 110.4537 110.4368 110.4566
stl_ortg_stdres <- rstandard(stl_ortg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
stl_ortg_points <- seq_along(stl_ortg_stdres) # Index of each observation, from 1 to n
stl_ortg_labels <- if_else(abs(stl_ortg_stdres) >= 1.5, paste(stl_ortg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = stl_ortg_points, y = stl_ortg_stdres)) +
  geom_point() +
  geom_text(aes(label = stl_ortg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
stl_ortg_hats <- hatvalues(stl_ortg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = stl_ortg_points, y = stl_ortg_hats)) +
  geom_point() # Shows the leverage of each point
stl_ortg_cook <- cooks.distance(stl_ortg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = stl_ortg_points, y = stl_ortg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 -0.004529181 1.986446 0.942
## Alternative hypothesis: rho != 0
stl_ortg_res <- residuals(stl_ortg_lm) # Residuals of the fitted model
stl_ortg_fitted <- predict(stl_ortg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = stl_ortg_fitted, y = stl_ortg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = stl_ortg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Assists v Offensive Rating
ast_ortg <- comb_team %>%
  ggplot(aes(AST, ORtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Assists and Offensive Rating
ast_ortg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] 0.459058
ast_ortg_lm <- lm(ORtg ~ AST, data = comb_team) # Creates a linear regression model
summary(ast_ortg_lm) # Prints the model summary
##
## Call:
## lm(formula = ORtg ~ AST, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.0234 -1.9528 -0.1719 1.6681 7.3184
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 94.142343 5.966670 15.778 1.83e-15 ***
## AST 0.008064 0.002949 2.734 0.0107 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.72 on 28 degrees of freedom
## Multiple R-squared: 0.2107, Adjusted R-squared: 0.1825
## F-statistic: 7.476 on 1 and 28 DF, p-value: 0.01072
## 1 2 3 4 5 6 7 8
## 111.2217 111.5201 109.8992 109.5041 108.6251 107.8349 109.6089 112.2458
## 9 10 11 12 13 14 15 16
## 109.0203 113.6006 108.1816 111.3024 110.0283 111.0443 109.9718 110.1976
## 17 18 19 20 21 22 23 24
## 111.3669 110.4153 112.0120 107.4155 109.6009 111.0362 111.9394 109.9234
## 25 26 27 28 29 30
## 109.3589 110.9395 110.3750 110.9556 111.3427 111.5120
ast_ortg_stdres <- rstandard(ast_ortg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
ast_ortg_points <- seq_along(ast_ortg_stdres) # Index of each observation, from 1 to n
ast_ortg_labels <- if_else(abs(ast_ortg_stdres) >= 1.5, paste(ast_ortg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = ast_ortg_points, y = ast_ortg_stdres)) +
  geom_point() +
  geom_text(aes(label = ast_ortg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
# Point 11 has a standardised residual of just over 2.8, so it may be an outlier or a high-leverage, high-influence point.
ast_ortg_hats <- hatvalues(ast_ortg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = ast_ortg_points, y = ast_ortg_hats)) +
  geom_point() # Shows the leverage of each point
ast_ortg_cook <- cooks.distance(ast_ortg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = ast_ortg_points, y = ast_ortg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 -0.109389 2.170904 0.678
## Alternative hypothesis: rho != 0
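The remark about point 11 above rests on its standardised residual. As a minimal sketch of how `rstandard()` arrives at such a value, the code below rebuilds it from the raw residual, the residual standard error and the leverage. The `mtcars` model and the `demo_` names are stand-ins for illustration, not part of this analysis.

```r
# Illustrative stand-in model: mtcars replaces comb_team here (assumption).
demo_lm <- lm(mpg ~ wt, data = mtcars)
demo_res <- residuals(demo_lm)
demo_hats <- hatvalues(demo_lm)
demo_sigma <- summary(demo_lm)$sigma # residual standard error

# Standardised residual: residual / (sigma * sqrt(1 - leverage)).
demo_stdres_manual <- demo_res / (demo_sigma * sqrt(1 - demo_hats))

# Compare with the built-in function.
print(all.equal(demo_stdres_manual, rstandard(demo_lm)))
```

Because the leverage appears in the denominator, a high-leverage point gets a larger standardised residual than a low-leverage point with the same raw residual, which is why the leverage and Cook's distance plots are read alongside this one.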
ast_ortg_res <- residuals(ast_ortg_lm) # Residuals of the fitted model
ast_ortg_fitted <- predict(ast_ortg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = ast_ortg_fitted, y = ast_ortg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = ast_ortg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
Blocks v Defensive Rating
blk_drtg <- comb_team %>%
  ggplot(aes(BLK, DRtg)) +
  geom_point() +
  geom_smooth(method = "lm", colour = "magenta") # Testing for a linear relationship between Blocks and Defensive Rating
blk_drtg # Prints the above graph
## `geom_smooth()` using formula 'y ~ x'
## [1] -0.561718
blk_drtg_lm <- lm(DRtg ~ BLK, data = comb_team) # Creates a linear regression model
summary(blk_drtg_lm) # Prints the model summary
##
## Call:
## lm(formula = DRtg ~ BLK, data = comb_team)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.9633 -2.2885 0.1359 1.8858 5.0243
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 121.648806 3.162619 38.465 < 2e-16 ***
## BLK -0.027687 0.007706 -3.593 0.00124 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.478 on 28 degrees of freedom
## Multiple R-squared: 0.3155, Adjusted R-squared: 0.2911
## F-statistic: 12.91 on 1 and 28 DF, p-value: 0.001238
## 1 2 3 4 5 6 7 8
## 110.0480 109.6050 112.2630 110.4356 111.9307 116.2499 111.9307 111.5985
## 9 10 11 12 13 14 15 16
## 112.4845 107.1132 110.4356 110.4633 110.9894 109.4666 109.2451 109.2451
## 17 18 19 20 21 22 23 24
## 108.1930 110.2695 109.4389 109.9650 109.8819 109.3282 109.6881 110.0757
## 25 26 27 28 29 30
## 110.2141 111.5985 110.9617 109.5497 108.2761 111.1555
blk_drtg_stdres <- rstandard(blk_drtg_lm) # Standardised residuals: each residual divided by its estimated standard deviation
blk_drtg_points <- seq_along(blk_drtg_stdres) # Index of each observation, from 1 to n
blk_drtg_labels <- if_else(abs(blk_drtg_stdres) >= 1.5, paste(blk_drtg_points), "") # Labels a point with its index when its standardised residual is at least 1.5 in absolute value
ggplot(data = NULL, aes(x = blk_drtg_points, y = blk_drtg_stdres)) +
  geom_point() +
  geom_text(aes(label = blk_drtg_labels), nudge_y = 0.3) +
  ylim(c(-4, 4)) +
  geom_hline(yintercept = c(-3, 3), colour = "red", linetype = "dashed") # Dashed lines at -3 and 3 mark the limits beyond which points are classified as outliers
blk_drtg_hats <- hatvalues(blk_drtg_lm) # Measures the leverage of each point
ggplot(data = NULL, aes(x = blk_drtg_points, y = blk_drtg_hats)) +
  geom_point() # Shows the leverage of each point
blk_drtg_cook <- cooks.distance(blk_drtg_lm) # Cook's distance: the collective change in the coefficients when the ith point is deleted
ggplot(data = NULL, aes(x = blk_drtg_points, y = blk_drtg_cook)) +
  geom_point() # Shows Cook's distance for each point as a scatterplot
## lag Autocorrelation D-W Statistic p-value
## 1 0.0515884 1.766672 0.44
## Alternative hypothesis: rho != 0
blk_drtg_res <- residuals(blk_drtg_lm) # Residuals of the fitted model
blk_drtg_fitted <- predict(blk_drtg_lm) # Fitted values, used to check homoscedasticity (constant residual variance across the fitted values)
ggplot(data = NULL, aes(x = blk_drtg_fitted, y = blk_drtg_res)) +
  geom_point(colour = "dodgerblue") +
  geom_hline(yintercept = 0, colour = "red", linetype = "dashed") # Residuals against fitted values, to check for homoscedasticity
ggplot(data = NULL, aes(sample = blk_drtg_res)) +
  stat_qq() + stat_qq_line() # Q-Q plot to check the normality of the residuals
comb_team %>%
  mutate(z_pts = round((PTS - mean(PTS)) / sd(PTS)),
         pts_per_game = PTS / G,
         z_x3p = round((x3P - mean(x3P)) / sd(x3P))) %>%
  ggplot() +
  stat_qq(aes(sample = pts_per_game)) +
  facet_wrap(~ z_x3p) # The z-score standardises each team's value against the group mean and standard deviation, so performances can be compared with the league average. The Q-Q plots check whether points per game are approximately normal within each rounded z-score group of 3-point makes.
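The z-score used for the facets above can be sketched in base R as follows. The points totals are made-up numbers for five hypothetical teams, used only to show the arithmetic, and the variable names are chosen for this illustration.

```r
# Hypothetical season points totals for five teams (made-up values).
pts <- c(9000, 9200, 9400, 9100, 9300)

# z-score: distance from the group mean in units of standard deviation.
z_pts <- (pts - mean(pts)) / sd(pts)

# Rounding places each team in a whole-number z-score group,
# which is what the facets above are built from.
z_bin <- round(z_pts)
print(data.frame(pts, z_pts, z_bin))
```

After standardising, the z-scores have mean 0 and standard deviation 1, so a team in the +1 group is roughly one standard deviation above the league average.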
comb_team %>%
  mutate(z_pts = round((PTS - mean(PTS)) / sd(PTS)),
         pts_per_game = PTS / G) %>%
  ggplot() +
  stat_qq(aes(sample = pts_per_game)) +
  facet_wrap(~ z_pts) # Q-Q plots of points per game within each rounded z-score group of total points. The z-score standardises each team's value against the group mean and standard deviation.
comb_team %>%
  mutate(x3ppg = x3P / G,
         ptspg = PTS / G) %>%
  summarize(avg_3p = mean(x3ppg),
            s_3p = sd(x3ppg),
            avg_ptspg = mean(ptspg),
            s_ptspg = sd(ptspg),
            r = cor(x3ppg, ptspg)) # Summary table of five statistics: the mean and standard deviation of team 3 Point Makes Per Game and of Points Per Game, plus the correlation between them. These are the quantities needed to predict a team's points per game from its 3 point makes per game.
## avg_3p s_3p avg_ptspg s_ptspg r
## 1 11.36382 1.504955 111.2085 4.092142 0.4565562
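The five summary statistics above are enough to write down the regression line for predicting points per game from 3-point makes per game: the slope is r × s_y / s_x, and the line passes through the two means. The sketch below uses the values copied from the table; the input of 12 threes per game is hypothetical, and `slope`, `intercept` and `pred_12` are names introduced for this illustration.

```r
# Values copied from the summary table above.
avg_3p <- 11.36382
s_3p <- 1.504955
avg_ptspg <- 111.2085
s_ptspg <- 4.092142
r <- 0.4565562

# Slope and intercept of the least-squares line from summary statistics.
slope <- r * s_ptspg / s_3p
intercept <- avg_ptspg - slope * avg_3p

# Predicted points per game for a hypothetical team making 12 threes per game.
pred_12 <- intercept + slope * 12
print(c(slope = slope, intercept = intercept, pred_12 = pred_12))
```

The same two formulas apply to each of the summary tables that follow, swapping in the relevant mean, standard deviation and correlation.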
comb_team %>%
  mutate(x2ppg = x2P / G,
         ptspg = PTS / G) %>%
  summarize(avg_2p = mean(x2ppg),
            s_2p = sd(x2ppg),
            avg_ptspg = mean(ptspg),
            s_ptspg = sd(ptspg),
            r = cor(x2ppg, ptspg)) # Summary table of five statistics: the mean and standard deviation of team 2 Point Makes Per Game and of Points Per Game, plus the correlation between them. These are the quantities needed to predict a team's points per game from its 2 point makes per game.
## avg_2p s_2p avg_ptspg s_ptspg r
## 1 29.71829 2.173743 111.2085 4.092142 0.3105844
comb_team %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
summarize(avg_ft = mean(ftppg),
s_ft = sd(ftppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(ftppg, ptspg)) # Five team-level summary statistics: mean and SD of free throws made per game and of points per game, plus their correlation r. Together they determine the least-squares line for predicting points per game from free throws made per game.
##     avg_ft     s_ft avg_ptspg  s_ptspg         r
## 1 17.68049 1.694804  111.2085 4.092142 0.4015758
comb_team %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
summarize(avg_ast = mean(astpg),
s_ast = sd(astpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(astpg, ptspg)) # Five team-level summary statistics: mean and SD of assists per game and of points per game, plus their correlation r. Together they determine the least-squares line for predicting points per game from assists per game.
##    avg_ast   s_ast avg_ptspg  s_ptspg         r
## 1 24.58659 2.08829  111.2085 4.092142 0.5653053
comb_team %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
summarize(avg_stl = mean(stlpg),
s_stl = sd(stlpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(stlpg, ptspg)) # Five team-level summary statistics: mean and SD of steals per game and of points per game, plus their correlation r. Together they determine the least-squares line for predicting points per game from steals per game.
##   avg_stl     s_stl avg_ptspg  s_ptspg        r
## 1 7.63374 0.8436119  111.2085 4.092142 0.171338
comb_team %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
summarize(avg_orb = mean(orbpg),
s_orb = sd(orbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(orbpg, ptspg)) # Five team-level summary statistics: mean and SD of offensive rebounds per game and of points per game, plus their correlation r. Together they determine the least-squares line for predicting points per game from offensive rebounds per game.
##    avg_orb     s_orb avg_ptspg  s_ptspg         r
## 1 10.34715 0.9837691  111.2085 4.092142 0.1915307
comb_team %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
summarize(avg_drb = mean(drbpg),
s_drb = sd(drbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(drbpg, ptspg)) # Five team-level summary statistics: mean and SD of defensive rebounds per game and of points per game, plus their correlation r. Together they determine the least-squares line for predicting points per game from defensive rebounds per game.
##    avg_drb    s_drb avg_ptspg  s_ptspg         r
## 1 34.81829 1.811346  111.2085 4.092142 0.5642984
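The seven team-level blocks above repeat one calculation with a different predictor each time. The following is a minimal sketch, in base R, of how that calculation could be wrapped in a helper. The function name `five_stats`, its generic column names and the toy values are all hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: the five summary statistics for a predictor x
# and a response y (means, standard deviations and correlation).
five_stats <- function(x, y) {
  data.frame(
    avg_x = mean(x),
    s_x   = sd(x),
    avg_y = mean(y),
    s_y   = sd(y),
    r     = cor(x, y)
  )
}

# Made-up per-game values for five hypothetical teams (illustration only).
toy_astpg <- c(22, 24, 25, 27, 23)
toy_ptspg <- c(105, 110, 112, 116, 108)

toy_stats <- five_stats(toy_astpg, toy_ptspg)
toy_stats
```

With the real data, a call such as `five_stats(comb_team$AST / comb_team$G, comb_team$PTS / comb_team$G)` would be expected to reproduce the assists table, assuming `comb_team` has the columns used above.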
New Data Frames for Specific Position
centres <- subset(p_stats, Pos == "C") # Creates a data frame specific for Centres
sg <- subset(p_stats, Pos == "SG") # Creates a data frame specific for Shooting Guards
sf <- subset(p_stats, Pos == "SF") # Creates a data frame specific for Small Forwards
pf <- subset(p_stats, Pos == "PF") # Creates a data frame specific for Power Forwards
pg <- subset(p_stats, Pos == "PG") # Creates a data frame specific for Point Guards
write_csv(x = pg, path = "data/processed/pg.csv") # Saves the Point Guard data to a CSV file.
write_csv(x = sg, path = "data/processed/sg.csv") # Saves the Shooting Guard data to a CSV file.
write_csv(x = sf, path = "data/processed/sf.csv") # Saves the Small Forward data to a CSV file.
write_csv(x = pf, path = "data/processed/pf.csv") # Saves the Power Forward data to a CSV file.
write_csv(x = centres, path = "data/processed/centres.csv") # Saves the Centre data to a CSV file.
Point Guard
reg_pg_drb_ppg <- pg %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
ggplot(aes(drbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against defensive rebounds per game for Point Guards.
reg_pg_drb_ppg # Displays the plot
reg_pg_orb_ppg <- pg %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
ggplot(aes(orbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against offensive rebounds per game for Point Guards.
reg_pg_orb_ppg # Displays the plot
reg_pg_stl_ppg <- pg %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
ggplot(aes(stlpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against steals per game for Point Guards.
reg_pg_stl_ppg # Displays the plot
reg_pg_ast_ppg <- pg %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
ggplot(aes(astpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against assists per game for Point Guards.
reg_pg_ast_ppg # Displays the plot
reg_pg_x3p_ppg <- pg %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
ggplot(aes(x3ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 3-point makes per game for Point Guards.
reg_pg_x3p_ppg # Displays the plot
reg_pg_x2p_ppg <- pg %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
ggplot(aes(x2ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 2-point makes per game for Point Guards.
reg_pg_x2p_ppg # Displays the plot
reg_pg_ft_ppg <- pg %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
ggplot(aes(ftppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against free throws made per game for Point Guards.
reg_pg_ft_ppg # Displays the plot
Shooting Guard
reg_sg_drb_ppg <- sg %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
ggplot(aes(drbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against defensive rebounds per game for Shooting Guards.
reg_sg_drb_ppg # Displays the plot
reg_sg_orb_ppg <- sg %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
ggplot(aes(orbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against offensive rebounds per game for Shooting Guards.
reg_sg_orb_ppg # Displays the plot
reg_sg_stl_ppg <- sg %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
ggplot(aes(stlpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against steals per game for Shooting Guards.
reg_sg_stl_ppg # Displays the plot
reg_sg_ast_ppg <- sg %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
ggplot(aes(astpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against assists per game for Shooting Guards.
reg_sg_ast_ppg # Displays the plot
reg_sg_x3p_ppg <- sg %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
ggplot(aes(x3ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 3-point makes per game for Shooting Guards.
reg_sg_x3p_ppg # Displays the plot
reg_sg_x2p_ppg <- sg %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
ggplot(aes(x2ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 2-point makes per game for Shooting Guards.
reg_sg_x2p_ppg # Displays the plot
reg_sg_ft_ppg <- sg %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
ggplot(aes(ftppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against free throws made per game for Shooting Guards.
reg_sg_ft_ppg # Displays the plot
Small Forward
reg_sf_drb_ppg <- sf %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
ggplot(aes(drbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against defensive rebounds per game for Small Forwards.
reg_sf_drb_ppg # Displays the plot
reg_sf_orb_ppg <- sf %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
ggplot(aes(orbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against offensive rebounds per game for Small Forwards.
reg_sf_orb_ppg # Displays the plot
reg_sf_stl_ppg <- sf %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
ggplot(aes(stlpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against steals per game for Small Forwards.
reg_sf_stl_ppg # Displays the plot
reg_sf_ast_ppg <- sf %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
ggplot(aes(astpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against assists per game for Small Forwards.
reg_sf_ast_ppg # Displays the plot
reg_sf_x3p_ppg <- sf %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
ggplot(aes(x3ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 3-point makes per game for Small Forwards.
reg_sf_x3p_ppg # Displays the plot
reg_sf_x2p_ppg <- sf %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
ggplot(aes(x2ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 2-point makes per game for Small Forwards.
reg_sf_x2p_ppg # Displays the plot
reg_sf_ft_ppg <- sf %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
ggplot(aes(ftppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against free throws made per game for Small Forwards.
reg_sf_ft_ppg # Displays the plot
Power Forward
reg_pf_drb_ppg <- pf %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
ggplot(aes(drbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against defensive rebounds per game for Power Forwards.
reg_pf_drb_ppg # Displays the plot
reg_pf_orb_ppg <- pf %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
ggplot(aes(orbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against offensive rebounds per game for Power Forwards.
reg_pf_orb_ppg # Displays the plot
reg_pf_stl_ppg <- pf %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
ggplot(aes(stlpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against steals per game for Power Forwards.
reg_pf_stl_ppg # Displays the plot
reg_pf_ast_ppg <- pf %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
ggplot(aes(astpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against assists per game for Power Forwards.
reg_pf_ast_ppg # Displays the plot
reg_pf_x3p_ppg <- pf %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
ggplot(aes(x3ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 3-point makes per game for Power Forwards.
reg_pf_x3p_ppg # Displays the plot
reg_pf_x2p_ppg <- pf %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
ggplot(aes(x2ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 2-point makes per game for Power Forwards.
reg_pf_x2p_ppg # Displays the plot
reg_pf_ft_ppg <- pf %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
ggplot(aes(ftppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against free throws made per game for Power Forwards.
reg_pf_ft_ppg # Displays the plot
Centres
reg_c_drb_ppg <- centres %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
ggplot(aes(drbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against defensive rebounds per game for Centres.
reg_c_drb_ppg # Displays the plot
reg_c_orb_ppg <- centres %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
ggplot(aes(orbpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against offensive rebounds per game for Centres.
reg_c_orb_ppg # Displays the plot
reg_c_stl_ppg <- centres %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
ggplot(aes(stlpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against steals per game for Centres.
reg_c_stl_ppg # Displays the plot
reg_c_ast_ppg <- centres %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
ggplot(aes(astpg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against assists per game for Centres.
reg_c_ast_ppg # Displays the plot
reg_c_x3p_ppg <- centres %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
ggplot(aes(x3ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 3-point makes per game for Centres.
reg_c_x3p_ppg # Displays the plot
reg_c_x2p_ppg <- centres %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
ggplot(aes(x2ppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against 2-point makes per game for Centres.
reg_c_x2p_ppg # Displays the plot
reg_c_ft_ppg <- centres %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
ggplot(aes(ftppg, ptspg)) +
geom_point(alpha = 0.5) # Scatter plot of points per game against free throws made per game for Centres.
reg_c_ft_ppg # Displays the plot
Point Guard
sum_stats_pg_drb_ppg <- pg %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
summarize(avg_drb = mean(drbpg),
s_drb = sd(drbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(drbpg, ptspg)) # Five summary statistics for Point Guards: mean and SD of defensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pg_orb_ppg <- pg %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
summarize(avg_orb = mean(orbpg),
s_orb = sd(orbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(orbpg, ptspg)) # Five summary statistics for Point Guards: mean and SD of offensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pg_stl_ppg <- pg %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
summarize(avg_stl = mean(stlpg),
s_stl = sd(stlpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(stlpg, ptspg)) # Five summary statistics for Point Guards: mean and SD of steals per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pg_ast_ppg <- pg %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
summarize(avg_ast = mean(astpg),
s_ast = sd(astpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(astpg, ptspg)) # Five summary statistics for Point Guards: mean and SD of assists per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pg_x3p_ppg <- pg %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
summarize(avg_3p = mean(x3ppg),
s_3p = sd(x3ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x3ppg, ptspg)) # Five summary statistics for Point Guards: mean and SD of 3-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pg_x2p_ppg <- pg %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
summarize(avg_2p = mean(x2ppg),
s_2p = sd(x2ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x2ppg, ptspg)) # Five summary statistics for Point Guards: mean and SD of 2-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pg_ft_ppg <- pg %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
summarize(avg_ft = mean(ftppg),
s_ft = sd(ftppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(ftppg, ptspg)) # Five summary statistics for Point Guards: mean and SD of free throws made per game and of points per game, plus their correlation r (the inputs for the regression line).
Shooting Guard
sum_stats_sg_drb_ppg <- sg %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
summarize(avg_drb = mean(drbpg),
s_drb = sd(drbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(drbpg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of defensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sg_orb_ppg <- sg %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
summarize(avg_orb = mean(orbpg),
s_orb = sd(orbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(orbpg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of offensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sg_stl_ppg <- sg %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
summarize(avg_stl = mean(stlpg),
s_stl = sd(stlpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(stlpg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of steals per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sg_ast_ppg <- sg %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
summarize(avg_ast = mean(astpg),
s_ast = sd(astpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(astpg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of assists per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sg_x3p_ppg <- sg %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
summarize(avg_3p = mean(x3ppg),
s_3p = sd(x3ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x3ppg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of 3-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sg_x2p_ppg <- sg %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
summarize(avg_2p = mean(x2ppg),
s_2p = sd(x2ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x2ppg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of 2-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sg_ft_ppg <- sg %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
summarize(avg_ft = mean(ftppg),
s_ft = sd(ftppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(ftppg, ptspg)) # Five summary statistics for Shooting Guards: mean and SD of free throws made per game and of points per game, plus their correlation r (the inputs for the regression line).
Small Forward
sum_stats_sf_drb_ppg <- sf %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
summarize(avg_drb = mean(drbpg),
s_drb = sd(drbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(drbpg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of defensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sf_orb_ppg <- sf %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
summarize(avg_orb = mean(orbpg),
s_orb = sd(orbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(orbpg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of offensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sf_stl_ppg <- sf %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
summarize(avg_stl = mean(stlpg),
s_stl = sd(stlpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(stlpg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of steals per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sf_ast_ppg <- sf %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
summarize(avg_ast = mean(astpg),
s_ast = sd(astpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(astpg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of assists per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sf_x3p_ppg <- sf %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
summarize(avg_3p = mean(x3ppg),
s_3p = sd(x3ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x3ppg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of 3-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sf_x2p_ppg <- sf %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
summarize(avg_2p = mean(x2ppg),
s_2p = sd(x2ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x2ppg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of 2-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_sf_ft_ppg <- sf %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
summarize(avg_ft = mean(ftppg),
s_ft = sd(ftppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(ftppg, ptspg)) # Five summary statistics for Small Forwards: mean and SD of free throws made per game and of points per game, plus their correlation r (the inputs for the regression line).
Power Forward
sum_stats_pf_drb_ppg <- pf %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
summarize(avg_drb = mean(drbpg),
s_drb = sd(drbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(drbpg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of defensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pf_orb_ppg <- pf %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
summarize(avg_orb = mean(orbpg),
s_orb = sd(orbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(orbpg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of offensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pf_stl_ppg <- pf %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
summarize(avg_stl = mean(stlpg),
s_stl = sd(stlpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(stlpg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of steals per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pf_ast_ppg <- pf %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
summarize(avg_ast = mean(astpg),
s_ast = sd(astpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(astpg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of assists per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pf_x3p_ppg <- pf %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
summarize(avg_3p = mean(x3ppg),
s_3p = sd(x3ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x3ppg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of 3-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pf_x2p_ppg <- pf %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
summarize(avg_2p = mean(x2ppg),
s_2p = sd(x2ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x2ppg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of 2-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_pf_ft_ppg <- pf %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
summarize(avg_ft = mean(ftppg),
s_ft = sd(ftppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(ftppg, ptspg)) # Five summary statistics for Power Forwards: mean and SD of free throws made per game and of points per game, plus their correlation r (the inputs for the regression line).
Centres
sum_stats_c_drb_ppg <- centres %>%
mutate(drbpg = DRB / G,
ptspg = PTS / G) %>%
summarize(avg_drb = mean(drbpg),
s_drb = sd(drbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(drbpg, ptspg)) # Five summary statistics for Centres: mean and SD of defensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_c_orb_ppg <- centres %>%
mutate(orbpg = ORB / G,
ptspg = PTS / G) %>%
summarize(avg_orb = mean(orbpg),
s_orb = sd(orbpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(orbpg, ptspg)) # Five summary statistics for Centres: mean and SD of offensive rebounds per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_c_stl_ppg <- centres %>%
mutate(stlpg = STL / G,
ptspg = PTS / G) %>%
summarize(avg_stl = mean(stlpg),
s_stl = sd(stlpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(stlpg, ptspg)) # Five summary statistics for Centres: mean and SD of steals per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_c_ast_ppg <- centres %>%
mutate(astpg = AST / G,
ptspg = PTS / G) %>%
summarize(avg_ast = mean(astpg),
s_ast = sd(astpg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(astpg, ptspg)) # Five summary statistics for Centres: mean and SD of assists per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_c_x3p_ppg <- centres %>%
mutate(x3ppg = x3P / G,
ptspg = PTS / G) %>%
summarize(avg_3p = mean(x3ppg),
s_3p = sd(x3ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x3ppg, ptspg)) # Five summary statistics for Centres: mean and SD of 3-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_c_x2p_ppg <- centres %>%
mutate(x2ppg = x2P / G,
ptspg = PTS / G) %>%
summarize(avg_2p = mean(x2ppg),
s_2p = sd(x2ppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(x2ppg, ptspg)) # Five summary statistics for Centres: mean and SD of 2-point makes per game and of points per game, plus their correlation r (the inputs for the regression line).
sum_stats_c_ft_ppg <- centres %>%
mutate(ftppg = FT / G,
ptspg = PTS / G) %>%
summarize(avg_ft = mean(ftppg),
s_ft = sd(ftppg),
avg_ptspg = mean(ptspg),
s_ptspg = sd(ptspg),
r = cor(ftppg, ptspg)) # Five summary statistics for Centres: mean and SD of free throws made per game and of points per game, plus their correlation r (the inputs for the regression line).
Point Guards
regline_pg_drb_ppg <- sum_stats_pg_drb_ppg %>%
summarise(slope = r * s_ptspg / s_drb,
intercept = avg_ptspg - slope * avg_drb) # Uses the five summary statistics to compute the slope and intercept of the regression line
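The slope and intercept formulas used here (slope = r × s_y / s_x, intercept = mean(y) − slope × mean(x)) give the ordinary least-squares line for a single predictor. As a quick sanity check, the sketch below compares them with `lm()` in base R; the vectors `x` and `y` are made-up values for illustration and are not drawn from the player data.

```r
# Made-up predictor and response values (illustration only).
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Regression line built from the five summary statistics.
slope     <- cor(x, y) * sd(y) / sd(x)
intercept <- mean(y) - slope * mean(x)

# The same line fitted directly by least squares.
fit <- lm(y ~ x)

# coef(fit) is ordered (Intercept), x; the two approaches should agree
# up to floating-point tolerance.
same_line <- isTRUE(all.equal(unname(coef(fit)), c(intercept, slope)))
same_line
```

Agreement between the two confirms that the manual approach and `lm()` describe the same line, so either could be used to draw it on the scatter plots.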
reg_pg_drb_ppg +
geom_abline(intercept = regline_pg_drb_ppg$intercept, slope = regline_pg_drb_ppg$slope) # Adds the regression line to the scatter plot
regline_pg_orb_ppg <- sum_stats_pg_orb_ppg %>%
summarise(slope = r * s_ptspg / s_orb,
intercept = avg_ptspg - slope * avg_orb) # Uses the five summary statistics to compute the slope and intercept of the regression line
reg_pg_orb_ppg +
geom_abline(intercept = regline_pg_orb_ppg$intercept, slope = regline_pg_orb_ppg$slope) # Adds the regression line to the scatter plot
regline_pg_stl_ppg <- sum_stats_pg_stl_ppg %>%
summarise(slope = r * s_ptspg / s_stl,
intercept = avg_ptspg - slope * avg_stl) # Uses the five summary statistics to compute the slope and intercept of the regression line
reg_pg_stl_ppg +
geom_abline(intercept = regline_pg_stl_ppg$intercept, slope = regline_pg_stl_ppg$slope) # Adds the regression line to the scatter plot
regline_pg_ast_ppg <- sum_stats_pg_ast_ppg %>%
summarise(slope = r * s_ptspg / s_ast,
intercept = avg_ptspg - slope * avg_ast) # Uses the five summary statistics to compute the slope and intercept of the regression line
reg_pg_ast_ppg +
geom_abline(intercept = regline_pg_ast_ppg$intercept, slope = regline_pg_ast_ppg$slope) # Adds the regression line to the scatter plot
regline_pg_x3p_ppg <- sum_stats_pg_x3p_ppg %>%
summarise(slope = r * s_ptspg / s_3p,
intercept = avg_ptspg - slope * avg_3p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_pg_x3p_ppg +
geom_abline(intercept = regline_pg_x3p_ppg$intercept, slope = regline_pg_x3p_ppg$slope, colour = "dodgerblue") # Plots the regression line on the graph
regline_pg_x2p_ppg <- sum_stats_pg_x2p_ppg %>%
summarise(slope = r * s_ptspg / s_2p,
intercept = avg_ptspg - slope * avg_2p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_pg_x2p_ppg +
geom_abline(intercept = regline_pg_x2p_ppg$intercept, slope = regline_pg_x2p_ppg$slope, colour = "orange") # Plots the regression line on the graph
regline_pg_ft_ppg <- sum_stats_pg_ft_ppg %>%
summarise(slope = r * s_ptspg / s_ft,
intercept = avg_ptspg - slope * avg_ft) # Derives the slope and intercept of the regression line from the five summary statistics
reg_pg_ft_ppg +
geom_abline(intercept = regline_pg_ft_ppg$intercept, slope = regline_pg_ft_ppg$slope, colour = "pink") # Plots the regression line on the graph
Shooting Guard
regline_sg_drb_ppg <- sum_stats_sg_drb_ppg %>%
summarise(slope = r * s_ptspg / s_drb,
intercept = avg_ptspg - slope * avg_drb) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_drb_ppg +
geom_abline(intercept = regline_sg_drb_ppg$intercept, slope = regline_sg_drb_ppg$slope, colour = "green") # Plots the regression line on the graph
regline_sg_orb_ppg <- sum_stats_sg_orb_ppg %>%
summarise(slope = r * s_ptspg / s_orb,
intercept = avg_ptspg - slope * avg_orb) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_orb_ppg +
geom_abline(intercept = regline_sg_orb_ppg$intercept, slope = regline_sg_orb_ppg$slope) # Plots the regression line on the graph
regline_sg_stl_ppg <- sum_stats_sg_stl_ppg %>%
summarise(slope = r * s_ptspg / s_stl,
intercept = avg_ptspg - slope * avg_stl) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_stl_ppg +
geom_abline(intercept = regline_sg_stl_ppg$intercept, slope = regline_sg_stl_ppg$slope) # Plots the regression line on the graph
regline_sg_ast_ppg <- sum_stats_sg_ast_ppg %>%
summarise(slope = r * s_ptspg / s_ast,
intercept = avg_ptspg - slope * avg_ast) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_ast_ppg +
geom_abline(intercept = regline_sg_ast_ppg$intercept, slope = regline_sg_ast_ppg$slope) # Plots the regression line on the graph
regline_sg_x3p_ppg <- sum_stats_sg_x3p_ppg %>%
summarise(slope = r * s_ptspg / s_3p,
intercept = avg_ptspg - slope * avg_3p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_x3p_ppg +
geom_abline(intercept = regline_sg_x3p_ppg$intercept, slope = regline_sg_x3p_ppg$slope) # Plots the regression line on the graph
regline_sg_x2p_ppg <- sum_stats_sg_x2p_ppg %>%
summarise(slope = r * s_ptspg / s_2p,
intercept = avg_ptspg - slope * avg_2p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_x2p_ppg +
geom_abline(intercept = regline_sg_x2p_ppg$intercept, slope = regline_sg_x2p_ppg$slope) # Plots the regression line on the graph
regline_sg_ft_ppg <- sum_stats_sg_ft_ppg %>%
summarise(slope = r * s_ptspg / s_ft,
intercept = avg_ptspg - slope * avg_ft) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sg_ft_ppg +
geom_abline(intercept = regline_sg_ft_ppg$intercept, slope = regline_sg_ft_ppg$slope) # Plots the regression line on the graph
Small Forwards
regline_sf_drb_ppg <- sum_stats_sf_drb_ppg %>%
summarise(slope = r * s_ptspg / s_drb,
intercept = avg_ptspg - slope * avg_drb) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_drb_ppg +
geom_abline(intercept = regline_sf_drb_ppg$intercept, slope = regline_sf_drb_ppg$slope) # Plots the regression line on the graph
regline_sf_orb_ppg <- sum_stats_sf_orb_ppg %>%
summarise(slope = r * s_ptspg / s_orb,
intercept = avg_ptspg - slope * avg_orb) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_orb_ppg +
geom_abline(intercept = regline_sf_orb_ppg$intercept, slope = regline_sf_orb_ppg$slope) # Plots the regression line on the graph
regline_sf_stl_ppg <- sum_stats_sf_stl_ppg %>%
summarise(slope = r * s_ptspg / s_stl,
intercept = avg_ptspg - slope * avg_stl) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_stl_ppg +
geom_abline(intercept = regline_sf_stl_ppg$intercept, slope = regline_sf_stl_ppg$slope) # Plots the regression line on the graph
regline_sf_ast_ppg <- sum_stats_sf_ast_ppg %>%
summarise(slope = r * s_ptspg / s_ast,
intercept = avg_ptspg - slope * avg_ast) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_ast_ppg +
geom_abline(intercept = regline_sf_ast_ppg$intercept, slope = regline_sf_ast_ppg$slope) # Plots the regression line on the graph
regline_sf_x3p_ppg <- sum_stats_sf_x3p_ppg %>%
summarise(slope = r * s_ptspg / s_3p,
intercept = avg_ptspg - slope * avg_3p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_x3p_ppg +
geom_abline(intercept = regline_sf_x3p_ppg$intercept, slope = regline_sf_x3p_ppg$slope) # Plots the regression line on the graph
regline_sf_x2p_ppg <- sum_stats_sf_x2p_ppg %>%
summarise(slope = r * s_ptspg / s_2p,
intercept = avg_ptspg - slope * avg_2p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_x2p_ppg +
geom_abline(intercept = regline_sf_x2p_ppg$intercept, slope = regline_sf_x2p_ppg$slope) # Plots the regression line on the graph
regline_sf_ft_ppg <- sum_stats_sf_ft_ppg %>%
summarise(slope = r * s_ptspg / s_ft,
intercept = avg_ptspg - slope * avg_ft) # Derives the slope and intercept of the regression line from the five summary statistics
reg_sf_ft_ppg +
geom_abline(intercept = regline_sf_ft_ppg$intercept, slope = regline_sf_ft_ppg$slope) # Plots the regression line on the graph
Centres
regline_c_drb_ppg <- sum_stats_c_drb_ppg %>%
summarise(slope = r * s_ptspg / s_drb,
intercept = avg_ptspg - slope * avg_drb) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_drb_ppg +
geom_abline(intercept = regline_c_drb_ppg$intercept, slope = regline_c_drb_ppg$slope) # Plots the regression line on the graph
regline_c_orb_ppg <- sum_stats_c_orb_ppg %>%
summarise(slope = r * s_ptspg / s_orb,
intercept = avg_ptspg - slope * avg_orb) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_orb_ppg +
geom_abline(intercept = regline_c_orb_ppg$intercept, slope = regline_c_orb_ppg$slope) # Plots the regression line on the graph
regline_c_stl_ppg <- sum_stats_c_stl_ppg %>%
summarise(slope = r * s_ptspg / s_stl,
intercept = avg_ptspg - slope * avg_stl) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_stl_ppg +
geom_abline(intercept = regline_c_stl_ppg$intercept, slope = regline_c_stl_ppg$slope) # Plots the regression line on the graph
regline_c_ast_ppg <- sum_stats_c_ast_ppg %>%
summarise(slope = r * s_ptspg / s_ast,
intercept = avg_ptspg - slope * avg_ast) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_ast_ppg +
geom_abline(intercept = regline_c_ast_ppg$intercept, slope = regline_c_ast_ppg$slope) # Plots the regression line on the graph
regline_c_x3p_ppg <- sum_stats_c_x3p_ppg %>%
summarise(slope = r * s_ptspg / s_3p,
intercept = avg_ptspg - slope * avg_3p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_x3p_ppg +
geom_abline(intercept = regline_c_x3p_ppg$intercept, slope = regline_c_x3p_ppg$slope) # Plots the regression line on the graph
regline_c_x2p_ppg <- sum_stats_c_x2p_ppg %>%
summarise(slope = r * s_ptspg / s_2p,
intercept = avg_ptspg - slope * avg_2p) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_x2p_ppg +
geom_abline(intercept = regline_c_x2p_ppg$intercept, slope = regline_c_x2p_ppg$slope) # Plots the regression line on the graph
regline_c_ft_ppg <- sum_stats_c_ft_ppg %>%
summarise(slope = r * s_ptspg / s_ft,
intercept = avg_ptspg - slope * avg_ft) # Derives the slope and intercept of the regression line from the five summary statistics
reg_c_ft_ppg +
geom_abline(intercept = regline_c_ft_ppg$intercept, slope = regline_c_ft_ppg$slope) # Plots the regression line on the graph
Points = 3-Point FG + 2-Point FG + Free Throw
x3pts_ppg <- ggplot(data = p_stats, aes(x = x3P, y = PTS)) +
geom_point(colour = "dodgerblue") +
geom_smooth(method = "lm", colour = "magenta") + # 3pt FG makes to total points
theme_linedraw()
x3pts_ppg # Prints the graph with a regression line
## `geom_smooth()` using formula 'y ~ x'
x2pts_ppg <- ggplot(data = p_stats, aes(x = x2P, y = PTS)) +
geom_point(colour = "dodgerblue") +
geom_smooth(method = "lm", colour = "magenta") + # 2pt FG makes to total points
theme_light()
x2pts_ppg # Prints the graph with a regression line
## `geom_smooth()` using formula 'y ~ x'
ftpts_ppg <- ggplot(data = p_stats, aes(x = FT, y = PTS)) +
geom_point(colour = "dodgerblue") +
geom_smooth(method = "lm", colour = "magenta") + # FT makes to total points
theme_dark()
ftpts_ppg # Prints the graph with a regression line
## `geom_smooth()` using formula 'y ~ x'
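The lines drawn earlier with `geom_abline()` use slope = r × s_y / s_x and intercept = mean(y) − slope × mean(x), which should describe the same line that `geom_smooth(method = "lm")` fits by least squares. A minimal sketch of that check follows. It uses the built-in `mtcars` data as a stand-in because the player tables are not reproduced here, and the names `x`, `y` and `matches` are illustrative assumptions rather than objects from this analysis.

```r
# Illustrative data only: mtcars ships with base R and stands in for the player tables.
x <- mtcars$wt   # predictor (plays the role of, e.g., free throws per game)
y <- mtcars$mpg  # response (plays the role of points per game)

# The five summary statistics
avg_x <- mean(x)
s_x   <- sd(x)
avg_y <- mean(y)
s_y   <- sd(y)
r     <- cor(x, y)

# Regression line built from the summary statistics
slope     <- r * s_y / s_x
intercept <- avg_y - slope * avg_x

# Least-squares fit of the same relationship, for comparison
fit <- lm(y ~ x)

# TRUE when both approaches give the same intercept and slope
matches <- isTRUE(all.equal(unname(coef(fit)), c(intercept, slope)))
matches
```

If the two agree, the hand-built `regline_*` tables and an `lm()` call can be used interchangeably for single-predictor lines.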
Combined Players
overall_tidy_combined <- lm(PTSpm ~ MP + x3P + x2P + FT, data = p_stats)
tidy(overall_tidy_combined, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points per minute.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.382 0.00622 61.4 2.80e-198 0.370 0.394
## 2 MP -0.000187 0.00000924 -20.2 4.18e- 62 -0.000205 -0.000169
## 3 x3P 0.00140 0.0000778 18.0 9.47e- 53 0.00125 0.00155
## 4 x2P 0.00105 0.0000538 19.6 1.97e- 59 0.000948 0.00116
## 5 FT 0.000319 0.0000594 5.37 1.42e- 7 0.000202 0.000435
overall_tidy_ppg <- lm(PPG ~ G + x3P + x2P + FT, data = p_stats)
tidy(overall_tidy_ppg, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 6.63 0.253 26.2 5.24e-87 6.13 7.12
## 2 G -0.100 0.00542 -18.5 8.34e-55 -0.111 -0.0896
## 3 x3P 0.0427 0.00150 28.4 8.04e-96 0.0397 0.0456
## 4 x2P 0.0297 0.00106 27.9 7.56e-94 0.0276 0.0317
## 5 FT 0.0136 0.00146 9.31 1.09e-18 0.0107 0.0165
overall_tidy_pts <- lm(PTS ~ AST + DRB + ORB + BLK + TOV, data = p_stats)
tidy(overall_tidy_pts, conf.int = TRUE)
## # A tibble: 6 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 40.0 18.5 2.16 3.12e- 2 3.63 76.3
## 2 AST -0.272 0.181 -1.50 1.34e- 1 -0.627 0.0842
## 3 DRB 0.842 0.166 5.09 5.80e- 7 0.517 1.17
## 4 ORB -0.632 0.315 -2.00 4.58e- 2 -1.25 -0.0119
## 5 BLK 0.533 0.524 1.02 3.10e- 1 -0.498 1.56
## 6 TOV 6.33 0.463 13.7 8.22e-35 5.42 7.24
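For readers who want to see where the `conf.low` and `conf.high` columns come from, the sketch below rebuilds a table with the same column names using base R only. It is an illustration under stated assumptions: `mtcars` stands in for `p_stats`, and `coef_table`, `t_crit` and `manual_low` are hypothetical names. It shows that the default 95% interval is the estimate plus or minus a t quantile times the standard error.

```r
# Illustrative model on built-in data; the predictors are stand-ins, not basketball statistics.
fit <- lm(mpg ~ wt + hp, data = mtcars)

coefs <- coef(summary(fit))       # Estimate, Std. Error, t value, Pr(>|t|)
ci    <- confint(fit, level = 0.95)

# Assemble a table with the same column names as the output above
coef_table <- data.frame(
  term      = rownames(coefs),
  estimate  = unname(coefs[, "Estimate"]),
  std.error = unname(coefs[, "Std. Error"]),
  statistic = unname(coefs[, "t value"]),
  p.value   = unname(coefs[, "Pr(>|t|)"]),
  conf.low  = unname(ci[, 1]),
  conf.high = unname(ci[, 2])
)

# The lower bound recomputed by hand: estimate - t quantile * standard error
t_crit     <- qt(0.975, df = df.residual(fit))
manual_low <- coef_table$estimate - t_crit * coef_table$std.error

coef_table
```

A term whose interval spans zero, such as AST or BLK in the last table above, cannot be distinguished from having no linear association with the response at the 5% level.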
Point Guard
pg_overall_tidy <- lm(PTSpm ~ MP + x3P + x2P + FT, data = pg)
tidy(pg_overall_tidy, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points per minute, specific to Point Guards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.363 0.0142 25.4 1.23e-38 0.334 0.391
## 2 MP -0.000162 0.0000196 -8.29 3.41e-12 -0.000201 -0.000123
## 3 x3P 0.00125 0.000150 8.33 2.83e-12 0.000952 0.00155
## 4 x2P 0.00102 0.000112 9.13 8.48e-14 0.000800 0.00125
## 5 FT 0.000264 0.000107 2.47 1.56e- 2 0.0000514 0.000477
pg_overall_tidy_ppg <- lm(PPG ~ G + x3P + x2P + FT, data = pg)
tidy(pg_overall_tidy_ppg, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points, specific to Point Guards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 7.10 0.630 11.3 8.59e-18 5.85 8.36
## 2 G -0.106 0.0136 -7.80 2.90e-11 -0.133 -0.0790
## 3 x3P 0.0423 0.00397 10.6 1.20e-16 0.0344 0.0502
## 4 x2P 0.0318 0.00252 12.7 2.70e-20 0.0268 0.0368
## 5 FT 0.00954 0.00306 3.11 2.61e- 3 0.00344 0.0156
pg_overall_tidy_pts <- lm(PTS ~ AST + DRB + ORB + BLK + TOV, data = pg)
tidy(pg_overall_tidy_pts, conf.int = TRUE) # Relates total points to actions that do not score directly (assists, rebounds, blocks, turnovers), specific to Point Guards.
## # A tibble: 6 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 17.0 50.1 0.339 0.735 -82.8 117.
## 2 AST 0.228 0.430 0.530 0.598 -0.628 1.08
## 3 DRB 0.152 0.616 0.246 0.806 -1.08 1.38
## 4 ORB -1.15 2.11 -0.543 0.589 -5.36 3.06
## 5 BLK 2.85 3.12 0.911 0.365 -3.38 9.07
## 6 TOV 5.78 0.917 6.30 0.0000000191 3.95 7.61
Shooting Guard
sg_overall_tidy <- lm(PTSpm ~ MP + x3P + x2P + FT, data = sg)
tidy(sg_overall_tidy, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points per minute, specific to Shooting Guards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.376 0.0119 31.6 1.77e-49 0.353 0.400
## 2 MP -0.000173 0.0000181 -9.56 3.19e-15 -0.000210 -0.000137
## 3 x3P 0.00145 0.000160 9.04 3.76e-14 0.00113 0.00177
## 4 x2P 0.000842 0.000112 7.51 4.78e-11 0.000619 0.00107
## 5 FT 0.000494 0.000154 3.21 1.87e- 3 0.000188 0.000800
sg_overall_tidy_ppg <- lm(PPG ~ G + x3P + x2P + FT, data = sg)
tidy(sg_overall_tidy_ppg, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points, specific to Shooting Guards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 5.87 0.486 12.1 2.84e-20 4.90 6.83
## 2 G -0.0860 0.0109 -7.86 9.58e-12 -0.108 -0.0643
## 3 x3P 0.0428 0.00312 13.7 1.96e-23 0.0365 0.0490
## 4 x2P 0.0253 0.00258 9.82 9.48e-16 0.0202 0.0304
## 5 FT 0.0190 0.00382 4.98 3.13e- 6 0.0115 0.0266
sg_overall_tidy_pts <- lm(PTS ~ AST + DRB + ORB + BLK + TOV, data = sg)
tidy(sg_overall_tidy_pts, conf.int = TRUE) # Relates total points to actions that do not score directly (assists, rebounds, blocks, turnovers), specific to Shooting Guards.
## # A tibble: 6 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -13.8 38.5 -0.359 0.720 -90.4 62.7
## 2 AST 0.246 0.474 0.519 0.605 -0.696 1.19
## 3 DRB 2.20 0.462 4.76 0.00000781 1.28 3.11
## 4 ORB 1.99 1.46 1.36 0.177 -0.912 4.88
## 5 BLK -3.29 2.20 -1.49 0.139 -7.67 1.09
## 6 TOV 4.32 0.946 4.56 0.0000167 2.44 6.20
Small Forward
sf_overall_tidy <- lm(PTSpm ~ MP + x3P + x2P + FT, data = sf)
tidy(sf_overall_tidy, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points per minute, specific to Small Forwards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.346 0.0119 29.0 1.25e-34 0.322 0.370
## 2 MP -0.000163 0.0000192 -8.49 1.63e-11 -0.000201 -0.000124
## 3 x3P 0.00131 0.000230 5.69 5.36e- 7 0.000848 0.00177
## 4 x2P 0.00106 0.000132 8.03 8.68e-11 0.000794 0.00132
## 5 FT 0.000294 0.000152 1.94 5.82e- 2 -0.0000106 0.000600
sf_overall_tidy_ppg <- lm(PPG ~ G + x3P + x2P + FT, data = sf)
tidy(sf_overall_tidy_ppg, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points, specific to Small Forwards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 7.34 0.569 12.9 4.05e-18 6.20 8.48
## 2 G -0.117 0.0125 -9.36 6.69e-13 -0.143 -0.0923
## 3 x3P 0.0438 0.00535 8.18 5.03e-11 0.0331 0.0545
## 4 x2P 0.0364 0.00335 10.9 3.37e-15 0.0296 0.0431
## 5 FT 0.00819 0.00456 1.80 7.79e- 2 -0.000946 0.0173
sf_overall_tidy_pts <- lm(PTS ~ AST + DRB + ORB + BLK + TOV, data = sf)
tidy(sf_overall_tidy_pts, conf.int = TRUE) # Relates total points to actions that do not score directly (assists, rebounds, blocks, turnovers), specific to Small Forwards.
## # A tibble: 6 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -73.4 43.6 -1.68 0.0982 -161. 14.1
## 2 AST -0.642 0.532 -1.21 0.233 -1.71 0.425
## 3 DRB 1.77 0.504 3.51 0.000920 0.758 2.78
## 4 ORB -0.497 1.06 -0.468 0.642 -2.63 1.63
## 5 BLK 0.833 1.58 0.528 0.600 -2.33 4.00
## 6 TOV 6.12 1.31 4.67 0.0000212 3.49 8.75
Power Forward
pf_overall_tidy <- lm(PTSpm ~ MP + x3P + x2P + FT, data = pf)
tidy(pf_overall_tidy, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points per minute, specific to Power Forwards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.413 0.0177 23.4 3.03e-34 0.378 0.448
## 2 MP -0.000246 0.0000345 -7.13 8.41e-10 -0.000314 -0.000177
## 3 x3P 0.00184 0.000395 4.66 1.54e- 5 0.00105 0.00263
## 4 x2P 0.00125 0.000207 6.06 6.79e- 8 0.000840 0.00167
## 5 FT 0.000295 0.000238 1.24 2.19e- 1 -0.000179 0.000770
pf_overall_tidy_ppg <- lm(PPG ~ G + x3P + x2P + FT, data = pf)
tidy(pf_overall_tidy_ppg, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points, specific to Power Forwards.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 6.24 0.641 9.74 1.61e-14 4.96 7.52
## 2 G -0.101 0.0155 -6.51 1.07e- 8 -0.132 -0.0700
## 3 x3P 0.0479 0.00564 8.49 2.79e-12 0.0366 0.0592
## 4 x2P 0.0288 0.00314 9.16 1.74e-13 0.0225 0.0351
## 5 FT 0.0152 0.00454 3.34 1.35e- 3 0.00612 0.0242
pf_overall_tidy_pts <- lm(PTS ~ AST + DRB + ORB + BLK + TOV, data = pf)
tidy(pf_overall_tidy_pts, conf.int = TRUE) # Relates total points to actions that do not score directly (assists, rebounds, blocks, turnovers), specific to Power Forwards.
## # A tibble: 6 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -23.2 30.1 -0.772 4.43e- 1 -83.3 36.8
## 2 AST -1.51 0.395 -3.83 2.81e- 4 -2.30 -0.725
## 3 DRB 1.25 0.257 4.88 6.96e- 6 0.741 1.77
## 4 ORB 0.0810 0.544 0.149 8.82e- 1 -1.01 1.17
## 5 BLK 0.0720 0.929 0.0775 9.38e- 1 -1.78 1.93
## 6 TOV 6.74 0.910 7.40 2.84e-10 4.92 8.55
Centres
c_overall_tidy <- lm(PTSpm ~ MP + x3P + x2P + FT, data = centres)
tidy(c_overall_tidy, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points per minute, specific to Centres.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 0.400 0.0122 32.8 3.56e-42 0.376 0.425
## 2 MP -0.000225 0.0000210 -10.7 4.89e-16 -0.000267 -0.000183
## 3 x3P 0.00175 0.000198 8.82 1.03e-12 0.00135 0.00214
## 4 x2P 0.00116 0.000107 10.8 3.19e-16 0.000947 0.00137
## 5 FT 0.000367 0.000108 3.38 1.22e- 3 0.000150 0.000584
c_overall_tidy_ppg <- lm(PPG ~ G + x3P + x2P + FT, data = centres)
tidy(c_overall_tidy_ppg, conf.int = TRUE) # Values players based on successful actions and their relative contribution to scoring points, specific to Centres.
## # A tibble: 5 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 7.35 0.568 12.9 1.17e-19 6.22 8.49
## 2 G -0.115 0.0116 -9.90 1.31e-14 -0.138 -0.0919
## 3 x3P 0.0433 0.00452 9.58 4.77e-14 0.0343 0.0524
## 4 x2P 0.0282 0.00192 14.7 2.13e-22 0.0244 0.0321
## 5 FT 0.0177 0.00286 6.20 4.36e- 8 0.0120 0.0234
c_overall_tidy_pts <- lm(PTS ~ AST + DRB + ORB + BLK + TOV, data = centres)
tidy(c_overall_tidy_pts, conf.int = TRUE) # Relates total points to actions that do not score directly (assists, rebounds, blocks, turnovers), specific to Centres.
## # A tibble: 6 x 7
## term estimate std.error statistic p.value conf.low conf.high
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -58.3 36.8 -1.58 0.118 -132. 15.3
## 2 AST 0.662 0.455 1.46 0.150 -0.246 1.57
## 3 DRB 0.0150 0.269 0.0556 0.956 -0.523 0.552
## 4 ORB 0.654 0.367 1.78 0.0798 -0.0798 1.39
## 5 BLK 2.44 0.534 4.57 0.0000231 1.37 3.50
## 6 TOV 5.06 0.982 5.16 0.00000264 3.10 7.03
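The five positions above are each fitted with the same three formulas, so the repetition could be reduced by fitting one formula per group in a loop. The sketch below shows one way to do that in base R. It is an assumption-laden illustration, not the method used in this report: `mtcars` split by `cyl` stands in for the player data split by position, and `groups`, `models` and `coef_by_group` are hypothetical names.

```r
# Split the stand-in data into one data frame per group (analogous to pg, sg, sf, pf, centres)
groups <- split(mtcars, mtcars$cyl)

# Fit the same linear model within each group
models <- lapply(groups, function(d) lm(mpg ~ wt + hp, data = d))

# Collect the coefficients into one matrix: one row per group, one column per term
coef_by_group <- t(sapply(models, coef))
coef_by_group
```

Laying the coefficients side by side makes it easier to compare, for example, how the weight on 3-point makes differs between positions.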
# Player Points Predictions by Position
Point Guard
pg_ppg_pts_predict <- pg %>%
filter(PPG >= 10) %>% # This will filter out Point Guards who score under 10 Points Per Game
mutate(pg_r_hat = predict(pg_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(pg_r_hat, PTS, label = player_name)) +
geom_point() +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() # Plots each Point Guard's predicted Points Per Game against their actual points output
pg_ppg_pts_predict # Prints the output
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Shooting Guard
sg_ppg_pts_predict <- sg %>%
filter(PPG >= 10) %>% # This will filter out Shooting Guards who score under 10 Points Per Game
mutate(sg_r_hat = predict(sg_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(sg_r_hat, PTS, label = player_name)) +
geom_point() +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() # Plots each Shooting Guard's predicted Points Per Game against their actual points output
sg_ppg_pts_predict # Prints the output
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Small Forward
sf_ppg_pts_predict <- sf %>%
filter(PPG >= 10) %>% # This will filter out Small Forwards who score under 10 Points Per Game
mutate(sf_r_hat = predict(sf_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(sf_r_hat, PTS, label = player_name)) +
geom_point() +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() # Plots each Small Forward's predicted Points Per Game against their actual points output
sf_ppg_pts_predict # Prints the output
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Power Forward
pf_ppg_pts_predict <- pf %>%
filter(PPG >= 10) %>% # This will filter out Power Forwards who score under 10 Points Per Game
mutate(pf_r_hat = predict(pf_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(pf_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() + # Plots each Power Forward's predicted Points Per Game against their actual points output
labs(title = "Predicted Points Per Game and Total Points for Power Forwards",
subtitle = "Comparing Model Predictions with Actual Output",
x = "Predicted Points Per Game",
y = "Total Points") + # Titles and axis labels matched to the variables plotted (pf_r_hat and PTS)
theme_linedraw()
pf_ppg_pts_predict # Prints the output
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Centres
c_ppg_pts_predict <- centres %>%
filter(PPG >= 10) %>% # This will filter out Centres who score under 10 Points Per Game
mutate(c_r_hat = predict(c_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(c_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() # Plots each Centre's predicted Points Per Game against their actual points output
c_ppg_pts_predict # Prints the output
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
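The prediction plots above all follow one pattern: filter to a subset, call `predict()` with `newdata`, and compare the prediction with the observed value. The sketch below isolates that pattern in base R. It assumes `mtcars` as stand-in data, and the names `scored`, `mpg_hat` and `resid` are hypothetical.

```r
# Fit on the full stand-in data set
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Keep only rows above a threshold (analogous to filter(PPG >= 10))
scored <- subset(mtcars, mpg >= 20)

# Predicted value for each remaining row, then the gap between actual and predicted
scored$mpg_hat <- predict(fit, newdata = scored)
scored$resid   <- scored$mpg - scored$mpg_hat

# Rows that most exceed their prediction come first
scored[order(-scored$resid), c("mpg", "mpg_hat", "resid")]
```

In the player setting, a large positive gap would flag someone who outperforms what the model expects from their shot profile.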
Point Guard
pg_pts_hat <- pg %>%
mutate(pg_hat = predict(pg_overall_tidy_ppg, newdata = .))
pg_salary <- pg_pts_hat %>%
ggplot(aes(Salary, pg_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1, cex = 2) # Compares and plots individual Player Salaries against their predicted Points Per Game to identify value
pg_salary # Plots predicted Points Per Game against each player's salary to show where the value is. We select D'Angelo Russell or Kemba Walker at Point Guard.
Shooting Guard
sg_pts_hat <- sg %>%
mutate(sg_hat = predict(sg_overall_tidy_ppg, newdata = .))
sg_salary <- sg_pts_hat %>%
ggplot(aes(Salary, sg_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) # Compares and plots individual Player Salaries against their predicted Points Per Game to identify value
sg_salary # Plots predicted Points Per Game against each player's salary to show where the value is. We select Donovan Mitchell or Devin Booker at Shooting Guard.
Small Forward
sf_pts_hat <- sf %>%
mutate(sf_hat = predict(sf_overall_tidy_ppg, newdata = .))
sf_salary <- sf_pts_hat %>%
ggplot(aes(Salary, sf_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) # Compares and plots individual Player Salaries against their predicted Points Per Game to identify value
sf_salary # Plots predicted Points Per Game against each player's salary to show where the value is. Depending on salary space, we pick either Kevin Durant or Kawhi Leonard at Small Forward, as they stand out from the pack of lower-priced players.
Power Forward
pf_pts_hat <- pf %>%
mutate(pf_hat = predict(pf_overall_tidy_ppg, newdata = .))
pf_salary <- pf_pts_hat %>%
ggplot(aes(Salary, pf_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) # Compares and plots individual Player Salaries against their predicted Points Per Game to identify value
pf_salary # Plots predicted Points Per Game against each player's salary to show where the value is. We select either Julius Randle or Tobias Harris at the Power Forward position.
Centres
c_pts_hat <- centres %>%
mutate(c_hat = predict(c_overall_tidy_ppg, newdata = .))
c_salary <- c_pts_hat %>%
ggplot(aes(Salary, c_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) # Compares and plots individual Player Salaries against their predicted Points Per Game to identify value
c_salary # Plots predicted Points Per Game against each player's salary to show where the value is. We select Karl-Anthony Towns at Centre.
selected_player_pg <- pg %>%
filter(player_name == "D'Angelo Russell") # Filter for the chosen PG
selected_player_pg_2 <- pg %>%
filter(player_name == "Kemba Walker") # Filter for the chosen PG
selected_pg <- bind_rows(selected_player_pg, selected_player_pg_2) # Combine the two selected PG players for a comparison to make a final decision.
selected_player_sg <- sg %>%
filter(player_name == "Donovan Mitchell") # Filter for the chosen SG
selected_player_sg_2 <- sg %>%
filter(player_name == "Devin Booker") # Filter for the chosen SG
selected_sg <- bind_rows(selected_player_sg, selected_player_sg_2) # Combine the two selected SG players for a comparison to make a final decision.
selected_player_sf <- sf %>%
filter(player_name == "Kevin Durant") # Filter for the chosen SF
selected_player_sf_2 <- sf %>%
filter(player_name == "Kawhi Leonard") # Filter for the chosen SF
selected_sf <- bind_rows(selected_player_sf, selected_player_sf_2) # Combine the two selected SF players for a comparison to make a final decision.
selected_player_pf <- pf %>%
filter(player_name == "Giannis Antetokounmpo") # Filter for the chosen PF
selected_player_pf_2 <- pf %>%
filter(player_name == "Julius Randle") # Filter for the chosen PF
selected_pf <- bind_rows(selected_player_pf, selected_player_pf_2) # Combine the two selected PF players for a comparison to make a final decision.
selected_player_c <- centres %>%
filter(player_name == "Karl-Anthony Towns") # Filter for the chosen C
# Offence Rating v Wins
final_of_wins <- comb_team %>%
ggplot(aes(ORtg, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "blue") +
labs(title = "Offensive Rating relative to Wins",
subtitle = "A Linear Relationship Explored",
caption = "Plotting Team Offensive Ratings and their correlating Wins",
x = "Team Offensive Ratings",
y = "Season Wins") # Changing and configuring the Titles of the graph.
final_of_wins # Prints the titled plot defined above
## `geom_smooth()` using formula 'y ~ x'
# Defence Rating v Wins
final_def_wins <- comb_team %>%
ggplot(aes(DRtg, W)) +
geom_point() +
geom_smooth(method = "lm", colour = "orange") +
labs(title = "Defensive Rating relative to Wins",
subtitle = "A Linear Relationship",
caption = "Plotting Team Defensive Ratings and their correlating Wins",
x = "Team Defensive Ratings",
y = "Season Wins") # Changing and configuring the Titles of the graph.
# Tests for a linear relationship between Defensive Rating and Wins. The negative slope is the expected direction: a higher defensive rating is a worse outcome for the team, so the ideal is a low defensive rating paired with a high win total.
final_def_wins # Prints the titled plot defined above
## `geom_smooth()` using formula 'y ~ x'
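The diagnostic plots in the rest of this section draw on standardised residuals, hat values, Cook's distance, fitted values and raw residuals from a fitted model. A minimal sketch of how such vectors can be extracted in base R is given below. It assumes `mtcars` in place of the team data, the name `diagnostics` is hypothetical, and the leverage cut-off of twice the mean hat value is a common rule of thumb rather than something stated in this report.

```r
# Simple one-predictor model on stand-in data (analogous to wins modelled on a team rating)
fit <- lm(mpg ~ wt, data = mtcars)

diagnostics <- data.frame(
  point  = seq_len(nrow(mtcars)),  # observation index for the x-axis
  fitted = unname(fitted(fit)),    # predicted values
  res    = unname(residuals(fit)), # raw residuals, for the constant-variance check
  stdres = unname(rstandard(fit)), # standardised residuals, compared against +/- 3
  hats   = unname(hatvalues(fit)), # leverage of each observation
  cook   = unname(cooks.distance(fit)) # influence on the fitted coefficients
)

# Flag potential outliers and high-leverage points
outlier_flag  <- abs(diagnostics$stdres) > 3
leverage_flag <- diagnostics$hats > 2 * mean(diagnostics$hats)

head(diagnostics)
```

The hat values of a linear model sum to the number of fitted coefficients, which gives a quick sanity check on the extracted vector.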
# The Offence Residuals
final_of_points <- ggplot(data = NULL, aes(x = of_wins_points, y = of_wins_stdres)) +
geom_point(colour = "red") +
geom_text(aes(label = of_wins_labels), nudge_y = 0.3) +
ylim(c(-4, 4)) +
geom_hline(yintercept = c(-3, 3), colour = "darkgreen", linetype = "dashed") +
labs(title = "Offensive Rating - Potential Leverage",
subtitle = "A Linear Relationship Explored",
caption = "Plotting Team Offensive Ratings and their leverage potential",
x = "Points of Offensive Rating Residuals",
y = "Standardised Residuals") # Changing and configuring the Titles of the graph.
final_of_points
# Leverage of Offensive Rating
final_of_lev <- ggplot(data = NULL, aes(x = of_wins_points, y = of_wins_hats)) +
geom_point(colour = "dodgerblue") +
labs(title = "Measuring Offensive Rating Leverage relative to Wins",
subtitle = "A Linear Relationship Explored",
caption = "Measuring hat values to determine leverage points",
x = "Points of Offensive Rating Residuals",
y = "Hat Values of Linear Model") # Changing and configuring the Titles of the graph.
final_of_lev
# Collective Change in the Coefficients of Offensive Rating and Wins
final_of_cook <- ggplot(data = NULL, aes(x = of_wins_points, y = of_wins_cook))+
geom_point(colour = "tomato2") +
labs(title = "Collective Change in Coefficients of Offensive Rating and Wins",
subtitle = "A Linear Relationship Explored",
caption = "Plotting the Coefficient Changes",
x = "Points of Offensive Rating Residuals",
y = "Change in Coefficient") # Changing and configuring the Titles of the graph.
final_of_cook
# Homoscedasticity
final_of_h <- ggplot(data = NULL, aes(x = of_wins_fitted, y = of_wins_res)) +
geom_point(colour = "dodgerblue") +
geom_hline(yintercept = 0, colour = "red", linetype = "dashed") +
labs(title = "Offensive Rating relative to Wins",
subtitle = "A Linear Relationship Explored",
caption = "Measuring the constant variance of Predicted Wins",
x = "Predicted Wins Relative to Offensive Rating",
y = "Residuals of Wins Relative to Offensive Rating") # Changing and configuring the Titles of the graph.
final_of_h
# Value in Predicted Output
## Centres
final_c_salary <- c_pts_hat %>%
ggplot(aes(Salary, c_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) + # Gives the predicted points average against the players salary, to see where the value is. We select Karl-Anthony Towns at Centre.
geom_hline(yintercept = 20, colour = "black", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "black", linetype = "dotted") + # Divides the graph into sections to indicate where players may be overpriced for their output
labs(title = "Predicted Points and Current Salary for Centres",
subtitle = "Finding Value in the Centre Position",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Predicted Points Per Game",
colour = "Player Position")
final_c_salary
## Power Forward
final_pf_salary <- pf_pts_hat %>%
ggplot(aes(Salary, pf_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) + # Labels each player so the predicted points average can be read against the player's salary to see where the value lies. We select either Julius Randle or Tobias Harris at the Power Forward position.
geom_hline(yintercept = 20, colour = "dodgerblue", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "dodgerblue", linetype = "dotted") + # Divides the graph into sections to indicate where players may be overpriced for their output
labs(title = "Predicted Points and Current Salary for Power Forwards",
subtitle = "Finding Value in the Power Forward Position",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Predicted Points Per Game",
colour = "Player Position")
final_pf_salary
## Small Forward
final_sf_salary <- sf_pts_hat %>%
ggplot(aes(Salary, sf_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) + # Labels each player so the predicted points average can be read against the player's salary to see where the value lies. Depending on salary space, we pick either Kevin Durant or Kawhi Leonard at Small Forward, as they stand out from the pack of lower-priced players.
geom_hline(yintercept = 20, colour = "darkorchid2", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "darkorchid2", linetype = "dotted") + # Divides the graph into sections to indicate where players may be overpriced for their output
labs(title = "Predicted Points and Current Salary for Small Forwards",
subtitle = "Finding Value in the Small Forward Position",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Predicted Points Per Game",
colour = "Player Position")
final_sf_salary
## Shooting Guard
final_sg_salary <- sg_pts_hat %>%
ggplot(aes(Salary, sg_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1.5, cex = 2) + # Labels each player so the predicted points average can be read against the player's salary to see where the value lies. We select Donovan Mitchell at Shooting Guard.
geom_hline(yintercept = 20, colour = "cyan", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "cyan", linetype = "dotted") + # Divides the graph into sections to indicate where players may be overpriced for their output
labs(title = "Predicted Points and Current Salary for Shooting Guards",
subtitle = "Finding Value in the Shooting Guard Position",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Predicted Points Per Game",
colour = "Player Position")
final_sg_salary
## Point Guard
final_pg_salary <- pg_pts_hat %>%
ggplot(aes(Salary, pg_hat, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1, cex = 2) + # Labels each player so the predicted points average can be read against the player's salary to see where the value lies. We select D'Angelo Russell at Point Guard.
geom_hline(yintercept = 20, colour = "navy", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "navy", linetype = "dotted") + # Divides the graph into sections to indicate where players may be overpriced for their output
labs(title = "Predicted Points and Current Salary for Point Guards",
subtitle = "Finding Value in the Point Guard Position",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Predicted Points Per Game",
colour = "Player Position")
final_pg_salary
## Linear Relationship for Points
# Points = 3 x (3-Point FG made) + 2 x (2-Point FG made) + Free Throws made
# 3 Point Relationship
final_x3pts_ppg <- ggplot(data = p_stats, aes(x = x3P, y = PTS)) +
geom_point(colour = "dodgerblue") +
geom_smooth(method = "lm", colour = "magenta") + # 3pt FG makes to total points
labs(title = "Successful 3 Point Attempts and Overall Points",
subtitle = "Identifying a Relationship between 3 Point Baskets and Total Points",
x = "Successful 3 Point Baskets",
y = "Total Points") + # Changing and configuring the Titles of the graph.
theme_linedraw()
# 2 Point Relationship
final_x2pts_ppg <- ggplot(data = p_stats, aes(x = x2P, y = PTS)) +
geom_point(colour = "dodgerblue") +
geom_smooth(method = "lm", colour = "magenta") + # 2pt FG makes to total points
labs(title = "Successful 2 Point Attempts and Overall Points",
subtitle = "Identifying a Relationship between 2 Point Baskets and Total Points",
x = "Successful 2 Point Baskets",
y = "Total Points") + # Changing and configuring the Titles of the graph.
theme_linedraw()
# Free Throw Relationship
final_ftpts_ppg <- ggplot(data = p_stats, aes(x = FT, y = PTS)) +
geom_point(colour = "dodgerblue") +
geom_smooth(method = "lm", colour = "magenta") + # FT makes to total points
labs(title = "Successful Free Throws and Overall Points",
subtitle = "Identifying a Relationship between Free Throws and Total Points",
x = "Successful Free Throws",
y = "Total Points") + # Changing and configuring the Titles of the graph.
theme_linedraw()
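The three plots above each relate one kind of made shot to total points. Because total points are an exact weighted sum of made three-pointers, made two-pointers and made free throws, a linear model on all three components should recover the weights 3, 2 and 1. The sketch below illustrates this with a small made-up data frame. `toy_stats` and its values are hypothetical and only reuse the column names `x3P`, `x2P`, `FT` and `PTS` from `p_stats`.

```r
# Hypothetical season totals for six made-up players (not real data).
toy_stats <- data.frame(
  player_name = c("Player A", "Player B", "Player C",
                  "Player D", "Player E", "Player F"),
  x3P = c(120,  40,   0, 200,  75, 10),  # made three-point field goals
  x2P = c(300, 410, 250, 150, 500, 90),  # made two-point field goals
  FT  = c(150, 220,  90,  60, 400, 30)   # made free throws
)

# Total points follow directly from the three kinds of made shots.
toy_stats$PTS <- 3 * toy_stats$x3P + 2 * toy_stats$x2P + toy_stats$FT

# Because the relationship is exact, the fitted slopes should be
# (up to floating-point error) 3, 2 and 1, with an intercept near 0.
fit_pts <- lm(PTS ~ x3P + x2P + FT, data = toy_stats)
round(coef(fit_pts), 6)
```

On real data the single-predictor plots show weaker relationships than this, since each one ignores the other two sources of points.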
# Points Per Game Value
combined_ppgsalary_value <- p_stats %>%
ggplot(aes(Salary, PPG, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = 1, cex = 2) +
geom_hline(yintercept = 20, colour = "navy", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "navy", linetype = "dotted") +
scale_y_continuous(limits = c(10,35)) +
labs(title = "Current Points Per Game",
subtitle = "Finding Value in the NBA",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Points Per Game",
colour = "Player Position") # Changing and configuring the Titles of the graph.
combined_ppgsalary_value
## Warning: Removed 234 rows containing missing values (geom_point).
## Warning: Removed 234 rows containing missing values (geom_text).
# Rebounds Per Game Value
combined_rpgsalary_value <- p_stats %>%
ggplot(aes(Salary, RPG, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = .5, cex = 2) +
geom_hline(yintercept = 7, colour = "navy", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "navy", linetype = "dotted") +
scale_y_continuous(limits = c(5,17)) +
labs(title = "Current Total Rebounds Per Game",
subtitle = "Finding Value in the NBA",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Rebounds Per Game",
colour = "Player Position") # Changing and configuring the Titles of the graph.
combined_rpgsalary_value
## Warning: Removed 284 rows containing missing values (geom_point).
## Warning: Removed 284 rows containing missing values (geom_text).
# Assists Per Game Value
combined_astsalary_value <- p_stats %>%
ggplot(aes(Salary, APG, colour = Pos, label = player_name)) +
geom_point() +
geom_text(nudge_y = .5, cex = 2) +
geom_hline(yintercept = 7, colour = "navy", linetype = "dotted") +
geom_vline(xintercept = 15000000, colour = "navy", linetype = "dotted") +
scale_y_continuous(limits = c(4,12)) +
labs(title = "Current Assists Per Game",
subtitle = "Finding Value in the NBA",
caption = "Finding a cheaper alternative with similar predicted output",
x = "Player Salary",
y = "Assists Per Game",
colour = "Player Position") + # Changing and configuring the Titles of the graph.
theme_bw() # Change the theme of the graph
combined_astsalary_value
## Warning: Removed 330 rows containing missing values (geom_point).
## Warning: Removed 330 rows containing missing values (geom_text).
# Player Points Predictions by Position
## Point Guard
final_pg_ppg_pts_predict <- pg %>%
filter(PPG >= 10) %>% # This will filter out Point Guards who score under 10 Points Per Game
mutate(pg_r_hat = predict(pg_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(pg_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() + # Adds a smoothed trend line comparing each Point Guard's predicted Points Per Game with their actual output
labs(title = "Predicted in comparison to Actual Points Output",
subtitle = "Identifying Point Guards who perform to expected levels",
x = "Predicted Points Per Game Average",
y = "Total Points") + # Changing and configuring the Titles of the graph.
theme_linedraw() # Changing the theme of the graph.
## Shooting Guard
final_sg_ppg_pts_predict <- sg %>%
filter(PPG >= 10) %>% # This will filter out Shooting Guards who score under 10 Points Per Game
mutate(sg_r_hat = predict(sg_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(sg_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() + # Adds a smoothed trend line comparing each Shooting Guard's predicted Points Per Game with their actual output
labs(title = "Predicted in comparison to Actual Points Output",
subtitle = "Identifying Shooting Guards who perform to expected levels",
x = "Predicted Points Per Game Average",
y = "Total Points") + # Changing and configuring the Titles of the graph
theme_linedraw() # Changing the theme of the graph.
## Small Forward
final_sf_ppg_pts_predict <- sf %>%
filter(PPG >= 10) %>% # This will filter out Small Forwards who score under 10 Points Per Game
mutate(sf_r_hat = predict(sf_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(sf_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() + # Adds a smoothed trend line comparing each Small Forward's predicted Points Per Game with their actual output
labs(title = "Predicted in comparison to Actual Points Output",
subtitle = "Identifying Small Forwards who perform to expected levels",
x = "Predicted Points Per Game Average",
y = "Total Points") + # Changing and configuring the Titles of the graph
theme_linedraw() # Changing the theme of the graph.
## Power Forward
final_pf_ppg_pts_predict <- pf %>%
filter(PPG >= 10) %>% # This will filter out Power Forwards who score under 10 Points Per Game
mutate(pf_r_hat = predict(pf_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(pf_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() + # Adds a smoothed trend line comparing each Power Forward's predicted Points Per Game with their actual output
labs(title = "Predicted in comparison to Actual Points Output",
subtitle = "Identifying Power Forwards who perform to expected levels",
x = "Predicted Points Per Game Average",
y = "Total Points") + # Changing and configuring the Titles of the graph
theme_linedraw() # Changing the theme of the graph.
## Centre
final_c_ppg_pts_predict <- centres %>%
filter(PPG >= 10) %>% # This will filter out Centres who score under 10 Points Per Game
mutate(c_r_hat = predict(c_overall_tidy_ppg, newdata = .)) %>%
ggplot(aes(c_r_hat, PTS, label = player_name)) +
geom_point(colour = "Red", alpha = 0.8) +
geom_text(nudge_x = 2, cex = 2) +
geom_smooth() + # Adds a smoothed trend line comparing each Centre's predicted Points Per Game with their actual output
labs(title = "Predicted in comparison to Actual Points Output",
subtitle = "Identifying Centres who perform to expected levels",
x = "Predicted Points Per Game Average",
y = "Total Points") + # Changing and configuring the Titles of the graph
theme_linedraw() # Changing the theme of the graph.
selected_team <- bind_rows(selected_player_c, selected_player_pf_2, selected_player_sf, selected_player_sg, selected_player_pg_2) # Combines the players filtered above into the chosen starting five.
datatable(selected_team, rownames = FALSE, filter = "top", options = list(pageLength = 5, scrollX = TRUE)) # The table describes the ideal team, selected on predicted value and current output per game. The datatable function is used to include a search bar.
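Since the brief asks that the five starters do not use up the whole $118 million budget, a quick check of the starters' combined salary against that figure may be useful. The following is only a sketch: `toy_team`, its player labels and its salaries are placeholders rather than the contents of `selected_team`, and it assumes a `Salary` column like the one used in the value plots.

```r
# Hypothetical starting five; names and salaries are placeholders, not real contracts.
toy_team <- data.frame(
  player_name = c("Centre X", "Power Forward X", "Small Forward X",
                  "Shooting Guard X", "Point Guard X"),
  Pos    = c("C", "PF", "SF", "SG", "PG"),
  Salary = c(8e6, 9e6, 30e6, 4e6, 20e6)
)

team_budget  <- 118e6                       # player-contract budget stated in the brief
starter_cost <- sum(toy_team$Salary)        # total committed to the five starters
remaining    <- team_budget - starter_cost  # left over for the rest of the roster
share_spent  <- starter_cost / team_budget  # fraction of the budget used by starters

c(starter_cost = starter_cost, remaining = remaining, share_spent = share_spent)
```

Replacing `toy_team` with `selected_team` would give the same summary for the chosen side, provided its salary column is numeric.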